Dr. Anu Rai, Dr. Rajib Sarkar, Ms. Kasturi Mukherjee, Department of Geography (School of Basic and Applied Sciences, Adamas University)
In simple terms, big data refers to data sets so large and complex that they are hard to manage with traditional data-processing environments or techniques. However, this massive, voluminous data has opened new horizons for informed decision-making.
6Vs of Big Data
A precise definition of “big” is elusive. In the era of the digital galaxy, the petabytes, exabytes and zettabytes considered large-scale today will likely seem small-scale in the not-so-distant future. Size alone, therefore, cannot define big data, which is highly complex and dynamic in nature. Big data is often characterised by its 6Vs.
- Volume: the huge amount of data generated every second/minute/hour/day in the digitised world.
- Velocity: the speed at which data is generated and the pace at which it moves from one point to another.
- Variety: the ever-expanding range of forms data can take, e.g. text, images, voice, video and geospatial data.
- Veracity: the quality and trustworthiness of the data, which can vary enormously.
- Valence: the degree to which data items are connected with one another.
- Value: the hidden insight that can be gained only after processing.
The aforesaid Vs represent distinct dimensions of big data: huge amounts of data generated from different sources, in different formats, with varying quality and unexplored value.
Big data in geography
The digital revolution of the 21st century has dramatically reduced the cost of data management and computation. A growing interest in understanding the dynamism of the geographical world around us has led to the growth of empiricism in geography. Empiricism advocates that human knowledge must be constructed through inductive logic and tested statistically on the basis of hypotheses. In this digital paradigm, geographical findings are often obtained through computer simulations built on multi-source observations. With advances in computer and communication technologies, geographical phenomena are increasingly observed in digital form from multiple sources. Three common sources of big data generation in geography are:
- Machine: In geographical studies, machine data is often treated as sensor data: data generated by earth observation and weather satellites, output from navigational satellites, location data from GPS receivers, RFID chip readings and so on. Machine-generated data is produced at a much faster rate than data from any other source. NASA is considered one of the biggest generators of spatial data, producing more than 12.1 TB per day. By the end of 2025, the volume of data in NASA’s Earth Observing System Data and Information System (EOSDIS) archive is projected to reach 250 PB.
- Organization: Organizational data includes data gathered collectively by institutions, such as smart-city data, data generated by national monitoring agencies, or the demographic and location data collected by large merchants like Amazon and Flipkart. Organizations produce data at a slower rate than other sources; however, both machine- and organization-generated big data tend to be more structured in format.
- People: The largest and most unstructured volumes of data are generated by people, every minute, through various social media platforms. An estimated 300 million new social media users join every year, roughly 550 new users every minute. Since 2013 the number of tweets per minute has grown by 58%, to more than 474,000 tweets per minute in 2019. 400 hours of video are uploaded to YouTube every minute. Instagram users upload over 100 million photos and videos every day, about 69,444 posts every minute. Every minute on Facebook, 510,000 comments are posted, 293,000 statuses are updated and 136,000 photos are uploaded. Facebook users also click the like button on more than 4 million posts every minute, and the like button has been pressed 13 trillion times in total. Over 3.5 billion Google searches are conducted worldwide every day, more than 40,000 search queries per second. Devices are a huge source of the data we create every day: not only mobile phones but smart TVs, smart watches, cars and the growing range of Internet of Things (IoT) devices produce an ever-increasing amount of data at the individual level. In the geographical domain, each person can be considered a sensor on the Internet, creating big data in the public interest either as volunteered geographic information (VGI) or as crowd-sourced data. However, data created by people is largely unstructured and needs special attention to analyse.
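The per-minute and per-second figures quoted above all follow from simple rate conversion. A minimal sketch of that back-of-envelope arithmetic (using the rounded figures from the text as inputs):

```python
# Back-of-envelope "velocity" arithmetic for the figures quoted above.
# Input numbers are the rounded estimates from the text, not measurements.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def per_second(count: float, period_seconds: float) -> float:
    """Convert an event count over a period into an events-per-second rate."""
    return count / period_seconds

# 3.5 billion Google searches per day -> queries per second
searches = per_second(3.5e9, SECONDS_PER_DAY)
print(f"Google searches: ~{searches:,.0f} per second")  # ~40,509 per second

# 300 million new social-media users per year -> new users per minute
new_users = per_second(300e6, 365 * SECONDS_PER_DAY) * 60
print(f"New social-media users: ~{new_users:,.0f} per minute")
```

The computed new-user rate comes out near 570 per minute; the text's "550 per minute" is the same estimate rounded more coarsely.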
Role of big data in informed decision-making
The growth in data, and in the ways it can be used, is rapidly changing how geospatial problems are analysed and understood. These huge, complex data sets help decision-makers understand trends, uncover hidden patterns and detect anomalies behind the scenes. This digital universe of complex data has undoubtedly enhanced the decision-making capacity of stakeholders across many fields of geographical interest.
- Big data in environmental monitoring
Since the inception of space technology, big data has played a crucial role in monitoring, predicting and managing environmental issues, through hundreds of missions and thousands of sensors and satellites orbiting the Earth. Big sensor data is produced by space agencies across the globe, including the Indian Institute of Remote Sensing, the China National Space Administration, the European Space Agency, the Russian Federal Space Agency (Roscosmos), the Japan Aerospace Exploration Agency (JAXA), the German Aerospace Center (DLR), the Italian Space Agency (ASI), the National Centre for Space Studies (France), the Canadian Space Agency and many others. The big data archives produced by earth observation systems have revolutionised the monitoring of geographical phenomena, the assessment of environmental quality, the protection of biodiversity and habitats and, most importantly, rapid response in managing hazards and disasters.
- Big data in disaster management
Crisis mapping toolkits (integrated big data platforms for disaster management that draw on diverse data sources) have reshaped disaster relief around the globe. The first major event to use crisis mapping with big data was the 2010 Haiti earthquake, which killed and injured hundreds of thousands of people and badly damaged infrastructure. People in search of help began mapping the basic infrastructure, especially in OpenStreetMap, and gained faster access to resources. Since then, crisis mappers have played a significant role in mapping events in Libya (refugees), Japan (crowdsourcing and radiation monitoring after the 2011 Tōhoku earthquake and tsunami), Chile (humanitarian response to the 2010 earthquake), Pakistan (the 2010 and 2011 floods), Somalia (refugees), Alabama (the 2011 Super Outbreak), the Philippines (Typhoon Haiyan, 2013), India (floods and a cyclone, 2013), Nepal (earthquake, 2015) and dozens of smaller disasters and events around the world.
- Big data in smart city
The feasibility of smart-city initiatives depends on big data and the evolution of the IoT. Big data empowers cities by offering large amounts of data collected through various sensors, including earth observation systems and IoT devices. The IoT integrates sensors, radio-frequency identification (RFID) and Bluetooth into the real-world environment through densely networked services. The continuous application of big data in smart grids, smart transportation, smart healthcare and smart governance is dramatically improving energy consumption, pollution monitoring, traffic control, safety and more. Many big data merchants have been invited to contribute to further enhancement of smart-city infrastructure. For instance, Uber recently announced that it would release six years of transportation data to cities to help them plan public transit. In 2019 Uber served 75 million passengers with 3.9 million drivers, completing about 14 million trips every day.
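At its core, much smart-city analytics is aggregation: reducing raw sensor streams to per-zone summaries a planner can act on. A minimal sketch, using hypothetical roadside-traffic readings (the zone names and counts are invented for illustration):

```python
# Illustrative sketch (hypothetical data): aggregating smart-city sensor
# readings by zone, the kind of reduction a traffic dashboard performs.
from collections import defaultdict

# (zone, vehicles counted in a 5-minute window) reported by roadside sensors
readings = [
    ("downtown", 120), ("downtown", 95),
    ("suburb", 40), ("suburb", 55),
    ("downtown", 110),
]

totals = defaultdict(int)
counts = defaultdict(int)
for zone, vehicles in readings:
    totals[zone] += vehicles
    counts[zone] += 1

# Average vehicles per window for each zone
averages = {zone: totals[zone] / counts[zone] for zone in totals}
print(averages)  # {'downtown': 108.33..., 'suburb': 47.5}
```

In a real deployment the same reduction would run continuously over a streaming platform rather than an in-memory list, but the logic is identical.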
- Big data in crime prediction
Big data has also been used to predict crimes before they happen: a “predictive policing” trial in California identified areas where crime would occur three times more accurately than existing forecasting methods. Predictive policing is especially useful for property-related crimes such as auto theft and burglary, where patterns can be detected. In Tennessee, officials reported that serious crimes fell 30 percent and violent crimes declined 15 percent after predictive analytics were introduced. In South Africa, some banks have used predictive models to anticipate the next ATM to be hit by criminals who use explosives to open ATMs.
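The simplest idea behind place-based crime forecasting is hotspot detection: bin past incident locations into grid cells and rank cells by frequency. A minimal sketch with hypothetical coordinates (real systems add time decay, covariates and much more data):

```python
# Grid-based hotspot detection sketch (hypothetical incident coordinates).
from collections import Counter

# Past incident locations in arbitrary map units
incidents = [(1.2, 3.4), (1.3, 3.5), (5.0, 5.1), (1.1, 3.6), (5.2, 5.0)]
CELL = 1.0  # grid cell size, in the same units as the coordinates

# Assign each incident to a grid cell and count incidents per cell
cells = Counter((int(x // CELL), int(y // CELL)) for x, y in incidents)

# Rank cells by incident frequency; the top cells are candidate hotspots
hotspots = cells.most_common(2)
print(hotspots)  # [((1, 3), 3), ((5, 5), 2)]
```

The ranking itself is trivial; the hard (and contested) part of predictive policing lies in the quality and bias of the incident data fed into it.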
- How big data fights back against COVID-19
We all know how big data analytics played a crucial role in tracking COVID-19 from the global to the local scale. The capabilities of big data analytics enabled faster responses in identifying trends, patterns and the most vulnerable communities, which in turn empowered policy-makers to take a proactive role in controlling the spread of this global pandemic. South Korea, Taiwan and Germany presented marvellous examples of the impactful application of big data in controlling the spread of COVID-19.
Future of Big Data Analytics Market for Geographers
Acknowledging the role of big data in spatial decision-making, big data statistics, analysis and reporting techniques have been widely incorporated into recent geography curricula. Prominent big data technologies now appear in the core course structure of geography: machine learning; distributed processing frameworks such as Apache Storm, Hadoop and MapReduce; distributed file systems with Hive and HBase; data mining tools like RapidMiner, Weka, Orange and KNIME; analysis and visualisation languages such as R and Python; and NoSQL databases such as Neo4j and Cassandra. This updated curriculum has opened a new horizon of job opportunities for big data geographers.
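Of the technologies listed, MapReduce is the easiest to grasp in miniature. A conceptual sketch of the pattern in plain Python, counting words across documents (Hadoop distributes these same map, shuffle and reduce phases across a cluster; here they run locally):

```python
# Conceptual MapReduce sketch: word count, the canonical example.
from itertools import groupby
from operator import itemgetter

def map_phase(doc: str):
    """Map: emit a (word, 1) pair for every word in the document."""
    for word in doc.lower().split():
        yield (word, 1)

def reduce_phase(pairs):
    """Shuffle/sort pairs by key, then reduce: sum the counts per word."""
    grouped = groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0))
    for word, group in grouped:
        yield (word, sum(count for _, count in group))

docs = ["big data in geography", "big data big value"]
pairs = [pair for doc in docs for pair in map_phase(doc)]
print(dict(reduce_phase(pairs)))
# {'big': 3, 'data': 2, 'geography': 1, 'in': 1, 'value': 1}
```

Because each map call touches only one document and each reduce call only one key, both phases parallelise naturally, which is exactly what makes the pattern suitable for cluster-scale data.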
Recently, a report published by a private firm estimated that the global geospatial analytics market was valued at USD 51,700.7 million in 2018 and is registering a CAGR of 15.0% from 2019 to 2025. The increasing penetration of the IoT, coupled with Artificial Intelligence (AI) and the big data market, is expected to be the major driving shaft of the geospatial industry. Because geospatial information systems can be easily integrated into an organization’s existing enterprise information systems, demand for GIS specialists is growing. One of the main factors hampering market development is operational: the lack of skilled professionals who can handle open-source tools and massive amounts of unstructured data. With all the information created by social media and online networks, smart sensors, satellites, surveillance cameras, the Internet and countless other gadgets, big data is all around us. The task of the geographer is to learn how to extract value and sense from big data, and to see how many opportunities are knocking at the door!
“Geography and the future of big data, big data and the future of geography” is indispensable.
(Mark Graham and Taylor Shelton, 2013)