Sunday, June 30, 2019

Big Data

Authors: Niwesh Koirala and Shashank Shreshta

Introduction

The 6th edition of “Data Never Sleeps” by Domo (Domo, 2017) states that “Over 2.5 quintillion bytes of data are created every single day, and it’s only going to grow from there. By 2020, it’s estimated that 1.7MB of data will be created every second for every person on earth.” It is also surmised that 90% of the world’s data was created in the last two years alone. We are literally generating gigabytes of data every day without realizing it, and massive internet companies exist to leverage these raw bits into usable information. We are now in the next evolution of the information age: the time of Big Data.
In 2005, Roger Mougalas from O’Reilly Media coined the term Big Data, only a year after the company coined the term Web 2.0 (Rijmenam, 2016). Big Data refers to data sets so large that they are almost impossible to manage and process using traditional business intelligence tools. SAS (SAS, 2019) defines Big Data as a collective term describing the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis. More than the collection of Big Data itself, it is its applications that make it a cornerstone for modern businesses.
The key characteristics of Big Data have been the three ‘V’s: Velocity, Volume and Variety, first identified by Gartner analyst Doug Laney (Laney, 2001). Big Data usually deals with a large amount of data (volume), differing in format (variety) – data is not limited to text and numbers but can also draw from images, videos and the like – and is characterized by the speed at which the data is collected and processed (velocity). Three more characteristics have lately been added to the set (Blacksell, 2017): Veracity, Value and Variability. Veracity deals with how much ‘noise’, i.e. irrelevant information, is in the data collected; Value refers to the overall potential of the data collected; and Variability deals with the various ways the data can be used.
The information derived from these massive pools of data allows us to make accurate market predictions, segment users and customize services. The vast application of Big Data across many fields has made it immensely appealing and has also raised valid questions about its usage. This paper will examine the business impacts, ethical questions and applicability of Big Data in the Nepali context and present cases that illustrate its development.

Brief history and development

An immense volume of data is created every single day. Although data has been created since the beginning of time, we are only now realizing how much of it has been available all along. To begin with, we can divide the history of Big Data into three eras:
i. 16th – 19th Century
ii. 20th Century
iii. 21st Century
16th – 19th Century
In 1663, John Graunt became a pioneer when he set out to raise awareness of the effects of the bubonic plague. He recorded and analyzed information on mortality rates in London and is considered one of the first statisticians to use data to support his conclusions. His book, “Natural and Political Observations Made upon the Bills of Mortality”, presents a statistical analysis of that data, and we can regard this work as one of the earliest footprints of Big Data. Later, in 1889, a computing system invented by Herman Hollerith was used to organize census data, making a huge impact on the history of computational technology. Unbeknownst to Graunt and Hollerith, they were laying the foundations of Big Data – the categorization and analysis of vast volumes of unstructured data.
20th Century:
The world wars began and ended, pushing civilization into the information age. In 1937, President Franklin D. Roosevelt’s administration contracted IBM to keep track of millions of Americans, and IBM responded by developing a punch-card reading system that helped in the accumulation of data. Later, in 1943, British engineers built Colossus, one of the very first data-processing machines, in order to decipher Nazi codes. Subsequently, in 1952, the National Security Agency (NSA) developed machines that could independently and automatically collect and process information. Recognizing the rising importance and potential of data, the United States established its first data center in 1965, with the purpose of storing millions of tax returns and sets of fingerprints; this can be marked as the starting point of large-scale electronic data storage. The revolution in data availability and consumption became enormous when Tim Berners-Lee invented the World Wide Web in 1989. In the final decade of the 20th century, the creation of data grew at an extremely high rate as more and more devices gained the capacity to access the internet.
21st Century:
The rapid evolution of the information age was a boon to the concept of Big Data. A common analogy holds that from the beginning of time up to the year 2003, humanity had stored about 5 exabytes of data; since 2016, we have been creating that volume of data every two days. In 2005, Roger Mougalas coined the term ‘Big Data’ to signify the massive amounts of data we had begun generating. The sheer volume of raw data also posed the question, “How do we turn Big Data into information?” The answer came in the same year, 2005, with a development that is considered a turning point for the field: Yahoo launched the open-source platform Hadoop. Initially created to index the entire web, Hadoop became the solution for processing the vast ocean of data we had begun generating. Today, Hadoop is used by millions of businesses around the world to work through colossal amounts of data. Data analytics has since seen remarkable changes around the world, and to this day we are witnessing the march towards the yet unscaled horizons of Big Data and its analytics.

Workflow, Benefits and Challenges

A Big Data ecosystem needs a robust workflow to function properly. Creating a workflow that suits the business and its goals is essential to actually reaping the benefits of becoming data-driven. According to Harvard Business Review (Randy Bean, 2019), 40.3% of executives identify lack of organizational alignment and 24% cite cultural resistance as the leading factors behind failures to adopt data-driven workflows. Alon Lebenthal identifies a functional Big Data workflow as having the following four steps (Lebenthal, 2018):
·     Ingesting data
·     Storing the data
·     Processing it
·     Making data available for analytics
Ingesting the data, i.e. data acquisition, can be done from a number of sources; in the current social media age, our cellphones have become data-acquisition hubs for online companies. Once acquired, the data is stored and processed for analysis. Traditionally, this required large storage units and was therefore too costly for most companies.
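As a minimal, hypothetical illustration of these four steps (the event records, table names and use of SQLite below are our own assumptions, not part of any platform cited in this paper), the following Python sketch ingests a handful of raw events, stores them, processes them into a daily aggregate, and makes that aggregate available for analytical queries. A production Big Data pipeline would replace these standard-library pieces with distributed tools such as Kafka, HDFS or Spark.

# A minimal, hypothetical sketch of the four-step workflow using only the
# Python standard library: ingest raw records, store them, process them,
# and expose an aggregate for analytics.
import sqlite3

# 1. Ingest: raw events as they might arrive from an app or sensor feed.
raw_events = [
    ("2019-06-01", "user_1", "page_view"),
    ("2019-06-01", "user_2", "purchase"),
    ("2019-06-02", "user_1", "purchase"),
]

# 2. Store: persist the raw data (here, an embedded in-memory database).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (day TEXT, user TEXT, action TEXT)")
db.executemany("INSERT INTO events VALUES (?, ?, ?)", raw_events)

# 3. Process: derive an aggregate (daily purchase counts).
db.execute("""
    CREATE TABLE daily_purchases AS
    SELECT day, COUNT(*) AS purchases
    FROM events WHERE action = 'purchase' GROUP BY day
""")

# 4. Make available for analytics: downstream tools query the derived table.
for day, purchases in db.execute("SELECT * FROM daily_purchases ORDER BY day"):
    print(day, purchases)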

However, as hardware and software become cheaper and more powerful, companies can even take advantage of cloud computing services to do the data crunching. Data centers can distribute batches of data for processing over multiple servers, and the number of servers can be scaled up or down quickly as needed. This scalable distributed computing is accomplished using tools like Apache Hadoop, MapReduce and Massively Parallel Processing (MPP). Similarly, NoSQL databases have been developed as more easily scalable alternatives to traditional SQL-based database systems.
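The MapReduce pattern used by Hadoop can be illustrated in miniature with a single-process Python word count; this is only a conceptual sketch, since a real Hadoop job would distribute the map, shuffle and reduce phases across a cluster rather than run them in one script.

# A miniature, single-process illustration of the MapReduce pattern:
# map each record to key/value pairs, shuffle (group) by key, then
# reduce each group. On a real cluster these phases run in parallel
# across many machines.
from collections import defaultdict

documents = [
    "big data needs big infrastructure",
    "data drives decisions",
]

# Map phase: emit (word, 1) for every word in every document.
mapped = [(word, 1) for doc in documents for word in doc.split()]

# Shuffle phase: group the emitted values by key.
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: combine each group into a single result.
word_counts = {word: sum(counts) for word, counts in groups.items()}
print(word_counts)  # e.g. {'big': 2, 'data': 2, ...}

Because each mapper and reducer works only on its own slice of the data, the same three phases scale naturally, which is what lets frameworks like Hadoop spread the work over many servers.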
The data thus processed can be analyzed as per the user’s parameters and visualized to detect trends, find patterns and, most importantly, inform strategy. The most common uses of Big Data have been targeted ads and customized products and services, but its reach extends much further. Information can be analyzed to find better business routes, monitor traffic congestion and create better road-management models; marketing agencies can use it to build customer profiles; and, at its most sensitive, Big Data analytics can even be used to spy on people. One such breach of privacy occurred with Cambridge Analytica and Facebook, where it was revealed that Cambridge Analytica had acquired data from nearly 87 million Facebook users and used it to create voting profiles – sending users targeted ads to influence them into voting for or against a certain political candidate. The candidate Cambridge Analytica was hired to help, Donald Trump, is now the President of the United States (BBC, 2018).
The challenges of using Big Data can be identified in two groups – Data Complexity and Computational Complexity. These topics are briefly discussed as follows:
Data complexity:
·     The emergence of Big Data has provided us with unprecedentedly large samples, which confront us with far more complex data objects.
·     The typical characteristics of Big Data are diversified types and patterns, complicated inter-relationships, and greatly varied data quality. The inherent complexity of Big Data (including complex types, complex structures, and complex patterns) makes its perception, representation, understanding and computation far more challenging and results in sharp increases in computational complexity compared to traditional computing models based on total data.
·     Traditional data analysis and mining tasks, such as retrieval, topic discovery, semantic analysis, and sentiment analysis, become extremely difficult when applied to Big Data.
·     We lack knowledge of the distribution laws and association relationships within Big Data.
·     We lack a deep understanding of the inherent relationship between the data complexity and the computational complexity of Big Data, as well as domain-oriented Big Data processing methods.
·     A fundamental problem is how to formulate or quantitatively describe the essential characteristics of the complexity of Big Data.
Computational Complexity:
·     New approaches will need to break away from assumptions made in traditional computation.
·     When solving problems involving Big Data, we need to re-examine and investigate its computability, computational complexity, and algorithms.
·     New approaches for Big Data computing will need to address Big Data-oriented, novel and highly efficient computing paradigms, provide innovative methods for processing and analyzing Big Data, and support value-driven applications in specific domains.
·     New features of Big Data processing, such as insufficient samples, open and uncertain data relationships, and unbalanced distribution of value density, not only provide great opportunities but also pose grand challenges to studying the computability of Big Data and developing new computing paradigms.
·     There is a massive hurdle in terms of ROI: unless a Big Data initiative is tied to company objectives and goals, the information obtained will only be scientific data that is not actionable. Any data-driven initiative must be designed with the company and its objectives in mind.

Business Impact

A literature review was carried out on Big Data and Big Data analytics. For this report we have included research from some of the top journals, conferences, and industry white papers from around the world. Although ample information was found on Big Data itself, it was somewhat more difficult to retrieve material on Big Data analytics, and most of the research we found was carried out in academia. Here we present the business benefits of Big Data, with relevant literature supporting them. Big Data has been changing and transforming the way we live, work and think (V. Mayer-Schonberger, 2013). The business impacts Big Data has been able to bring about are as follows:
Impact on national development:
In-depth analysis and utilization of Big Data play an important role in promoting the sustained economic growth of countries and enhancing the competitiveness of companies. In the future, Big Data will become a new engine of economic growth. With Big Data, companies will be able to upgrade and transform to an Analysis-as-a-Service (AaaS) model, thereby changing the ecology of IT and other industries.
At the national level, the capacity to accumulate, process, and utilize vast amounts of data will become a new landmark of a country’s strength. A country’s data sovereignty in cyberspace will be another great power-game space besides land, sea, air, and outer space. Western countries, represented by the United States, are moving under their national agendas towards modernizing their national strength through Big Data research and applications. It is anticipated that future economic and political competition among countries will be based on exploiting the potential of Big Data, alongside more traditional sources of power.
Impact on industrial upgrades:
Harnessing Big Data is currently a common challenge faced by many industries. Everyone in industry hopes to mine Big Data to extract information, knowledge and even intelligence, and ultimately to take full advantage of its value. Big Data has, in effect, become a key input for decision-making tools and strategies rather than a mere byproduct. Big Data and its analytics are not only a new engine sustaining the high growth of the information industry, but also a new tool for other industries to improve their competitiveness. Cloud computing, for example, provides the IT infrastructure for Big Data, while Big Data is an application of cloud computing; industries whose decisions depend on timely analysis of their own operating conditions can therefore benefit from Big Data and its analytics.
Impact on scientific research:
Big Data has caused the scientific community to re-examine its methodology of scientific research (J. Hey, 2009) and has triggered a revolution in scientific thinking and methods. It is well-known that the earliest scientific research in human history was based on experiments. Later on, theoretical science emerged, which was characterized by the study of various laws and theorems.
However, because theoretical analysis is too complex and not feasible for solving practical problems, people began to seek simulation-based methods, which led to computational science. The emergence of Big Data has spawned a new, fourth research paradigm: with Big Data, researchers may only need to find or mine from the data the required information, knowledge and intelligence. Turing Award winner Jim Gray believed that this fourth paradigm may be the only systematic way to solve some of the toughest global challenges we face today. In essence, the fourth paradigm is not only a change in the way scientific research is done, but also a change in the way people think (V. Mayer-Schonberger, 2013).
Impact on multidisciplinary research:
Big Data technologies and the corresponding fundamental research have become a research focus in academia. An emerging interdisciplinary discipline called data science (Data Science, 2014) has gradually come into place. It takes Big Data as its research object and aims at generalizing the extraction of knowledge from data. It spans many disciplines, including information science, mathematics, social science, network science, system science, psychology, and economics (Loukides, 2011) (C. O'Neil, 2013). It employs various techniques and theories from many fields, including signal processing, probability theory, machine learning, statistical learning, computer programming, data engineering, pattern recognition, visualization, uncertainty modeling, data warehousing, and high performance computing.
Many research centers and institutes on Big Data have been established in recent years at universities throughout the world (such as the University of California at Berkeley, Columbia University, New York University, Tsinghua University, Eindhoven University of Technology, and the Chinese University of Hong Kong). Many universities and research institutes have also set up undergraduate and/or postgraduate courses on data analytics to cultivate talent, including data scientists and data engineers.

Implementations

Big Data analytics has permeated multiple sectors and applications. As our discussion has shown, the key strength of Big Data lies in turning the massive data collected into analytical information that can drive business and strategic decisions. According to strategic advisors NewVantage Partners (Osborne, 2019), the “fear of disruption” by rivals using data and AI to their advantage will fuel investments in Big Data by competitive firms as 2019 moves forward. The survey behind the study found that 75% of respondents fear disruption from new entrants, 88% feel greater urgency to invest in Big Data and AI, 92% are driven by positive objectives – transformation, agility, or competition – and only 5% are driven by cost reduction.

According to a Forrester TechRadar study of Big Data technologies (Press, 2016), MPP data warehouses, predictive analytics, data visualization and distributed file storage are among the most successful technologies, with most of them expected to reach the next phase of development in the next three to five years.
In light of these developments on the technical side, it is important to see how practical applications of Big Data are being implemented in the business sector and how successful they have been.
Dr Pepper Snapple Group (DPSG) has utilized machine learning and predictive analytics tools in its Big Data platform MyDPS, boosting its productivity and revenue. In a testimonial for the platform (Symphony Retail, 2018), John Williams, Director of Category Management at DPSG, said, “By using aisle optimization technology, we have increased our margin dollars by $1.4 million.”
The challenges DPSG faced were two-fold: first, the information flow to its sales routes was voluminous – large binders filled with customer data, sales notes and promotions – and second, with consumer preferences changing, the company needed to keep up with the market and identify growth categories. Utilizing MyDPS and the SR Assortment and Space solution, DPSG was able to address these problems and turn the massive amounts of data it held into strategic business insights.
MyDPS was initially tested in isolated DPSG branches. According to a NetworksAsia case study (Boulton, 2017), the sales staff that used the platform reported a 50 percent increase in sales. This motivated Tom Farrah, CIO of DPSG, to implement the platform company-wide. “Our sales route staff were glorified order takers. Now, they are becoming intelligent salespeople equipped with information to achieve their goals,” he remarked. The platform is equipped with machine learning and analytics tools that funnel recommendations and a daily scorecard to workers, showing expected projections, their sales track and insights for correcting course if needed.
Rolls-Royce Holdings is another prominent name using Big Data to its competitive advantage. In a Forbes report (Marr, 2015), Paul Stein, the company’s chief scientific officer, said: “We have huge clusters of high-power computing which are used in the design process. We generate tens of terabytes of data on each simulation of one of our jet engines.” The chief areas in which Rolls-Royce uses Big Data are design, manufacturing and after-sales support.
The terabytes of design data are processed into design visualizations and evaluations. This allows the company to simulate design performance in extreme conditions, taking much of the need for physical testing out of the equation and bringing both performance gains and reduced testing costs. The company’s manufacturing systems are also networked and communicate with each other in an Internet of Things environment.
An example of this can be seen in what Rolls-Royce refers to as its Ship Intelligence initiative (Marr, 2015). Developed with the VTT Technical Research Centre of Finland, the initiative automates security processes and gives the commanding crew of a ship a digital dashboard. It also equips the craft with sophisticated Big Data-driven automatic piloting and operating systems. Hazards detected by sensors can be highlighted to the crew right in front of their eyes on augmented reality (AR) displays, and the ship can automatically plot a safe path.
So efficient has Rolls-Royce been in its data-driven initiatives that they have started becoming a product in themselves. In 2015, Rolls-Royce inked a five-year deal with Singapore Airlines to provide the airline with its TotalCare civil aerospace software (Murphy, 2015). The software provides fuel-consumption monitoring, on-board system monitoring, flight planning, operations control and engineering systems, and is projected to significantly cut fuel consumption in aircraft.
However, not all Big Data initiatives have been successful. One of the biggest failures remains with one of the frontrunners: Google. In 2008, Google launched Google Flu Trends (GFT) with the aim of predicting disease outbreaks and trends at a fraction of the cost of traditional models. In 2015, the service was shut down amid heavy criticism of its accuracy and of privacy issues stemming from its data aggregation. According to Lazer, Kennedy, King and Vespignani (David Lazer R. K., 2014), GFT was predicting more than double the proportion of doctor visits for influenza-like illness (ILI) reported by the Centers for Disease Control and Prevention (CDC), which bases its estimates on surveillance reports from laboratories across the United States – despite the fact that GFT was built to predict CDC reports. According to a Harvard research paper, even after Google Flu Trends was updated in 2009, the comparative value of the algorithm as a stand-alone flu monitor was questionable; a study in 2010 demonstrated that GFT's accuracy was not much better than a fairly simple projection forward using already available (typically two-week-lagged) CDC data.
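To make that comparison concrete, the following Python sketch (with entirely made-up ILI figures) shows the kind of naive baseline the study refers to: simply carrying the most recently available CDC number forward by two weeks and measuring the error. The data values and the two-week lag constant here are illustrative assumptions, not figures from the cited studies.

# A toy sketch (hypothetical numbers) of the naive baseline described above:
# predict this week's CDC influenza-like-illness (ILI) rate by carrying
# forward the most recent CDC figure, which is typically published on a
# roughly two-week lag.
cdc_ili = [1.2, 1.4, 1.9, 2.6, 3.4, 3.9, 3.7, 3.1]  # weekly ILI %, made up

LAG_WEEKS = 2

errors = []
for week in range(LAG_WEEKS, len(cdc_ili)):
    prediction = cdc_ili[week - LAG_WEEKS]  # latest figure available at prediction time
    actual = cdc_ili[week]
    errors.append(abs(prediction - actual))

mean_abs_error = sum(errors) / len(errors)
print(f"Mean absolute error of the 2-week-lag baseline: {mean_abs_error:.2f} ILI %")

The point is simply that even this trivial model sets a surprisingly high bar, which is why GFT's added value as a stand-alone monitor was questioned.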
Google Flu Trends symbolizes the biggest problem researchers have identified with Big Data: most Big Data sources that have received popular attention are not the output of instruments designed to produce valid and reliable data amenable to scientific analysis. Google Flu Trends also did not share its data with others. As Wired reported (David Lazer R. K., 2015), “while Google’s efforts in projecting the flu were well meaning, they were remarkably opaque in terms of method and data—making it dangerous to rely on Google Flu Trends for any decision-making.”
The closing of Google Flu Trends makes it a cautionary tale (David Lazer R. K., 2014), sparking the term “Big Data hubris.” The value of the data held by entities like Google is almost limitless, which also means those holding the data have a responsibility to use it in the public’s best interest. Being opaque and unable to forecast accurately spelled the end for Google Flu Trends, turning it, as Wired stated, “from the poster child of Big Data into the poster child of the foibles of Big Data.”

Practicalities and Potential in Nepal

The use of Big Data in Nepal is still in its nascent stage. According to “Nepal’s emerging data revolution” by Development Initiatives (Rana, 2017), Nepal’s 27 ministries have digitised their day-to-day operations and about half of Nepal’s 7,000 government offices are now reported to be computerised, yet paper-based systems of data collection and management are still common. “The problem with Big Data in Nepal, as with many technologies, is that it is still in the hype phase,” said Prabin Joshi, CTO of Rooster Logic, a Kathmandu-based data research firm. “The next problem is data collection: there is not enough collected, and what is there is not in a properly structured format. It will still take a while to properly digitize and structure the data to make anything of it.”
In 2014, a research project coordinated by the Open Data in Developing Countries programme set out to explore the emerging impacts of open data in Nepal. The general lack of open data was quickly discovered (Rana, 2017). The project also found that data meeting the needs of decision-makers and accountability actors is not available: data is not disaggregated, there are significant data gaps, it is not timely, and different datasets lack interoperability due to a lack of standards (Rana, 2017).
Despite the lack of practical applications so far, the potential sectors for use remain promising. The United Nations has recently stressed the use of data for development and the achievement of the Sustainable Development Goals:
“Big Data analysis techniques could be adopted to gain real-time insights into people’s wellbeing and to target aid interventions to vulnerable groups. New sources of data, new technologies, and new analytical approaches, if applied responsibly, can enable more agile, efficient and evidence-based decision-making and can better measure progress on the Sustainable Development Goals (SDGs) in a way that is both inclusive and fair.” (United Nations, 2017)
Ajay Ohri of the IBM Big Data & Analytics Hub (Ohri, 2015) identified financial services, agriculture, education, healthcare, corruption reduction and carbon-consumption optimization as major areas where Big Data technologies can have a significant impact. Similarly, Pratima Pradhan and Subarna Shakya discuss the possibility of using Big Data in e-governance, as in India:
“In India the “Adhaar” card [3, 5, 10] was introduced as a unique identifier for transparent citizen benefits. This card could hold the key to verification for multiple purposes. Nepal could benefit for passport, taxation, and license and citizen benefit distribution” (Pratima Pradhan, 2018)
The paper additionally identifies disaster relief as a key area where Big Data technologies can help. Taking the 2015 earthquake as an example, identification cards akin to the Indian “Adhaar” card could help consolidate relief clusters, rescue strategies and the rollout of support materials and capital. In fact, Kathmandu Living Labs was recognized internationally (Sinha, 2015) for aiding in the initial days of disaster relief during the 2015 earthquake through its open-source mapping services.
The outsourcing market, however, has shown immense potential in Nepal. Dovan Rai (Rai, 2017) recognizes the following companies in her report on Big Data:
·         YoungInnovations creates automated data tools and data platforms.
·         CraftData Labs works with business data along with open data for governance.
·         GrowByData specializes in Big Data for e-commerce.
·         Grepsr provides data scraping solutions.
·         Fusemachines creates automated sales platforms using Big Data.
·         Deerwalk provides Big Data solutions to the healthcare industry.
·         LeapFrog Technology offers healthcare data solutions as one of its technology services.
·         Javra Software works with Big Data to create business intelligence tools.
This is by no means an exhaustive list of all the companies working on outsourced Big Data solutions, but it shows that Nepali companies have stepped up to the plate in learning the architecture of Big Data technologies. “There is, of course, potential in Nepal, but so much of the collected data is in offline format and that is the hurdle,” said Aakar Anil, a member of CloudFactory. A data processing company with a focus on natural language processing, CloudFactory has built its Big Data workflow around a distributed workforce that analyses the volumes of information it receives from clients. “Our workers work online and they analyse the data provided as per the client specifications. Some information requires human observation, and our platform uses our cloudworkers (online workers) to process the data.” CloudFactory currently employs more than 8,000 workers to process the data needs of more than 150 AI, NLP and automation projects for global companies like Microsoft, Drive.ai, Ibotta and nuTonomy (CloudFactory, 2016).
Currently, cellphone and telecommunications penetration in Nepal is high, with Nepal Telecom estimating that about 60% of the population has access to a cellphone (Nepal Telecom, 2019). OnlineKhabar reports that the number of registered cellphone users in Nepal is 34% higher than the actual population (OnlineKhabar, 2018), a figure explained by single users owning multiple SIMs or devices. Leveraging this telecommunications penetration, the opportunities for Big Data-driven initiatives are plentiful. “The biggest platforms I see are health, education and agriculture,” said Prabin Joshi of Rooster Logic. “Nepal needs infrastructure and educational reforms, and data-driven analytics can elevate both development and commercial organizations in the sector.” The potential is there, but what must happen is a long-term vision of how to realize that potential, not merely a jump aboard the hype train.

Conclusions

As discussed above, Big Data analytics can open vast horizons of opportunity across applications and areas such as customer intelligence, fraud detection, and supply chain management. Its benefits can serve different sectors and industries, such as healthcare, retail, telecom and manufacturing. However, Big Data is also very difficult to deal with: it requires proper storage, management, integration, federation, cleansing, processing and analysis. Big Data exponentially increases all the difficulties we face with traditional data management, because of the additional volumes, velocities, and varieties of data and sources that have to be handled.
We saw that Big Data analytics is of great significance in this era of data surplus, and can provide unforeseen insights and benefits to decision makers in various areas. If properly harnessed and applied, Big Data analytics has the potential to provide a basis for advancement at the scientific, technological, and humanitarian levels. Industry is already applying Big Data to great advantage. According to figures Alibaba disclosed in March 2014, its data center had by then stored more than 100 PB of processed data, the equivalent of 100 million high-resolution movies. During the most recent “Singles’ Day” (also known as “Double 11 Day”), Alibaba pulled in around 278 million orders. For this annual shopping event, Alibaba developed a real-time data processing platform called Galaxy, which can handle 5 million transactions per second and process about 2 PB of data every day. Industry has been more successful in this respect because it has two essential driving forces: it really needs to possess Big Data in real time, and it has strong requirements for making better use of the data it collects.
However, Big Data requires clarity about one’s own business and an ethical framework for its use. As seen in Cambridge Analytica’s case, Big Data makes it easy to manipulate people’s perspectives, and such unethical profiling is not marketing or business strategy but fraud of the highest order. A data-driven approach to business, or even to development, can yield massive benefits, but it must be focused, tailored to one’s objectives and used in an ethical manner.

References:


·         BBC. (2018, 04 05). How the Facebook-Cambridge Analytica data scandal unfolded. Retrieved 05 29, 2019 from British Broadcasting Corporation: https://www.bbc.com/news/av/technology-43650346/how-the-facebook-cambridge-analytica-data-scandal-unfolded
·         Blacksell, T. (2017, 06 2). The evolution of big data – the ‘6 Vs’. Retrieved 05 24, 2019 from experian home page: https://www.experian.co.uk/blogs/latest-thinking/identity-and-fraud/the-evolution-of-big-data-the-6vs/
·         Boulton, C. (2017, 10 03). 6 data analytics success stories. Retrieved 05 28, 2019 from NetworksAsia: https://www.networksasia.net/article/6-data-analytics-success-stories-inside-look.1506994161/page/0/2
·         C. O'Neil, R. Schutt (2013). Doing Data Science: Straight Talk from the Frontline. O'Reilly Media Inc.
·         CloudFactory. (2016, 05 01). Careers at Cloudfactory. Retrieved 05 27, 2019 from Cloudfactory: https://cloudfactory.breezy.hr
·         Data Science. (2014). From Wikipedia: http://en.wikipedia.org/wiki/Data_science
·         David Lazer, R. K. (2014, 03 14). The Parable of Google Flu: Traps in Big Data Analysis. Science, 1203.
·         David Lazer, R. K. (2015, 01 10). What We Can Learn from the Epic Failure of Google Flu Trends. Retrieved 05 23, 2019 from Wired: https://www.wired.com/2015/10/can-learn-epic-failure-google-flu-trends/
·         Domo. (2017). Data Never Sleeps 6.0. American Fork, Utah: Domo, Inc.
·         J. Hey, S. T. (2009). The Fourth Paradigm: Data-Intensive Scientific Discovery. Microsoft Corporation.
·         Laney, D. (2001). 3D Data management. Stamford: META Delta.
·         Lebenthal, A. (2018, 07 09). Five Reasons You Need a Step-by-Step Approach to Workflow Orchestration for Big Data. Retrieved 05 24, 2019 from bmcBlogs: https://www.bmc.com/blogs/five-reasons-you-need-a-step-by-step-approach-to-workflow-orchestration-for-big-data/
·         Loukides, M. (2011). What is Data Science? O'Reilly Media Inc.
·         Marr, B. (2015, 06 01). How Big Data Drives Success At Rolls-Royce. Retrieved 05 22, 2019 from Forbes: https://www.forbes.com/sites/bernardmarr/2015/06/01/how-big-data-drives-success-at-rolls-royce/#4c7778d41d69
·         Murphy, M. (2015, 06 30). Singapore Airlines inks 5-year deal with Rolls Royce for fuel efficiency data analytics. (IDG) Retrieved 05 26, 2019 from ComputerWorld Uk: https://www.computerworlduk.com/data/rolls-royce-provide-airlines-with-fuel-efficiency-data-analytics-3618027/
·         Nepal Telecom. (2019, 04 17). Smartphone penetration in Nepal and the impact. Retrieved 05 27, 2019 from Nepal Telecom: https://www.nepalitelecom.com/2018/03/smartphone-penetration-nepal-and-the-impact.html
·         Ohri, A. (2015, 03 06). Big Data Initiatives in Developing Nations. Retrieved 05 28, 2019 from IBM big data & analytics hub: https://www.ibmbigdatahub.com/blog/big-data-initiatives-developing-nations
·         OnlineKhabar. (2018, 08 30). Number of mobile phone users in Nepal is 34% higher than population. Retrieved 05 28, 2019 from OnlineKhabar.com: https://english.onlinekhabar.com/number-of-mobile-phone-users-in-nepal-is-34-higher-than-population.html
·         Osborne, C. (2019, 01 03). Fortune 1000 to ‘urgently’ invest in Big Data, AI in 2019 in fear of digital rivals. Retrieved 05 27, 2019 from ZDNet: https://www.zdnet.com/article/fortune-1000-to-urgently-invest-in-big-data-ai-in-2019-in-fear-of-digital-rivals/
·         Pratima Pradhan, S. S. (2018). Big Data Challenges for e-Government Services in Nepal. Journal of the Institute of Engineering.
·         Press, G. (2016, 03 14). Top 10 Hot Big Data Technologies. Retrieved 05 21, 2019 from Forbes: https://www.forbes.com/sites/gilpress/2016/03/14/top-10-hot-big-data-technologies/#6081ffdb65d7
·         Rai, D. (2017, 12 22). Interested in big data in Nepal? See what’s happening. Retrieved 06 25, 2019 from Chautari: https://chautaari.com/big-data-nepal/
·         Rana, L. D. (2017). Nepal’s emerging data revolution. Kathmandu: Development Initiatives.
·         Randy Bean, T. H. (2019, 02 05). Companies Are Failing in Their Efforts to Become Data-Driven. Retrieved 05 20, 2019 from Harvard Business Review: https://hbr.org/2019/02/companies-are-failing-in-their-efforts-to-become-data-driven
·         Rijmenam, D. M. (2016, 01 07). A Short History Of Big Data. Retrieved 06 1, 2019 from Datafloq.com: https://datafloq.com/read/big-data-history/239
·         SAS. (2019, 05 24). Big Data: What it is and why it matters. Retrieved 05 28, 2019 from SAS: The Power to Know: https://www.sas.com/en_us/insights/big-data/what-is-big-data.html
·         Sinha, S. (2015, 06 01). 3 Ways Nepalis Are Using Crowdsourcing to Aid in Quake Relief. Retrieved 06 01, 2019 from The New York Times: https://www.nytimes.com/2015/05/02/world/asia/3-ways-nepalis-are-using-crowdsourcing-to-aid-in-quake-relief.html
·         Symphony Retail. (2018, 04 04). Dr Pepper Snapple category optimization gains big results. Retrieved 05 24, 2019 from Symphony Retail: Customer Success Stories: https://www.symphonyretailai.com/customer-success-stories/dr-pepper-snapple-group/
·         United Nations. (2017, 01 05). Big Data for Sustainable Development. Retrieved 05 27, 2019 from United Nations: Global Issues: https://www.un.org/en/sections/issues-depth/big-data-sustainable-development/index.html
·         V. Mayer-Schonberger, K. C. (2013). Big Data: A Revolution That Will Transform How We Live, Work, and Think. Houghton Mifflin Harcourt.




