SERVICE ITEMS
Big data, an IT industry term, refers to data sets that cannot be captured, managed and processed with conventional software tools within a reasonable period of time. It is a massive, fast-growing and diverse information asset that requires new processing models to deliver stronger decision-making power, insight and process optimization.
Gartner, a research firm, gives a similar definition: big data is a massive, fast-growing and diverse information asset that requires new processing models in order to deliver stronger decision-making power, insight and process optimization.
The McKinsey Global Institute defines big data as a data set so large that its acquisition, storage, management and analysis exceed the capabilities of traditional database software tools. It has four characteristics: massive data scale, rapid data flow, diverse data types and low value density. [3]
The strategic significance of big data technology lies not in holding huge volumes of data, but in processing the meaningful data professionally. In other words, if big data is compared to an industry, the key to profit in that industry is to improve the "processing ability" applied to data and to add value to the data through that processing. [4]
Technically, big data and cloud computing are as inseparable as the two sides of a coin. Big data cannot be processed on a single computer; it requires a distributed architecture. Its hallmark is distributed data mining over massive data sets, which in turn relies on the distributed processing, distributed databases, cloud storage and virtualization technologies of cloud computing. [1]
With the advent of the cloud era, big data has attracted more and more attention. Analysts generally use the term to describe the large volumes of unstructured and semi-structured data that a company creates, data that would cost too much time and money to load into a relational database for analysis. Big data analysis is often associated with cloud computing, because real-time analysis of large data sets requires a framework that distributes the work across dozens, hundreds or even thousands of computers.
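The idea of allocating work across many computers can be illustrated with a minimal sketch. It assumes a simple word-count task and uses a local thread pool as a stand-in for separate machines; the function names (`split_into_chunks`, `count_words`, `merge_counts`, `distributed_word_count`) are hypothetical and not taken from any particular framework. Each worker processes only its own slice of the data, and the partial results are merged at the end.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(records, n_chunks):
    """Partition the records into at most n_chunks roughly equal slices."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def count_words(chunk):
    """Map step: a worker counts the words in its own slice only."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Reduce step: combine two partial results into one."""
    return left + right


def distributed_word_count(records, workers=4):
    """Split the data, count each slice in parallel, then merge the results.

    Assumption: the thread pool stands in for a cluster. In a real
    deployment each slice would be sent to a different computer.
    """
    chunks = split_into_chunks(records, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    return reduce(merge_counts, partials, Counter())
```

Calling `distributed_word_count(["big data", "Big cloud", "data data"], workers=2)` splits the three records into two slices, counts each slice independently and merges the two partial counts. The same split, process and merge pattern is what lets the workload grow by adding computers rather than by buying a larger single machine.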
Big data requires special technologies to process large amounts of data within a tolerable time. Technologies applicable to big data include massively parallel processing (MPP) databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the Internet and scalable storage systems.