A REVIEW ON OIL AND GAS ORGANIZATION’S DATA LAKES IMPLEMENTATION BEST PRACTICES
Abstract
Business organizations have leveraged data analytics since the 1950s, but its use has grown
markedly in recent years, mainly because fast and efficient computation is now widely
available. This has encouraged the emergence of big data analytics. One of the main
components of big data analytics is the ability of the framework or architecture to manage
data at scale and across different data types, i.e. structured, semi-structured, and unstructured
data. This need motivated the introduction of data lakes: massively scalable data stores
that can hold all three types of data mentioned above. However, despite their
significant benefits, many have found that data lakes eventually turned into data swamps. This is
because data lakes are typically implemented without considering the fundamentals
needed to operationalize the generated insights. We therefore initiated research to
investigate best practices for data lake implementation, using an oil and gas organization as
the case study. In this paper, we present a review of data lake implementation best
practices for oil and gas organizations. The review will support our research in developing a
guideline of data lake implementation best practices to support big data initiatives in the
oil and gas industry.