doi: 10.3934/bdia.2018001
Online First


Understanding AI in a world of big data

Environics Analytics, 33 Bloor St. East, Toronto, Ont. M4W3H1, Canada

Early access October 2018

Big Data and AI are now popular concepts in the public lexicon. Yet much confusion exists as to what these concepts actually mean and, more importantly, why they are significant forces in the world today. New tools and technologies now allow better access to data and facilitate its analysis for better decision-making. But the discipline of data science, with its four-step process for conducting any analysis, is the key to success in both non-advanced and advanced analytics, which of course includes the use of AI. This paper attempts to demystify these concepts from a data science perspective. To understand Big Data and AI, we look at the history of data science and at how these more recent concepts have helped to optimize solutions within this four-step process.

Citation: Richard Boire. Understanding AI in a world of big data. Big Data & Information Analytics, doi: 10.3934/bdia.2018001
References:
[1] Figure 1: The 5 V's of big data, Environics Analytics: Best Practices and Considerations in Big Data Analytics, June 2018.

[2] Figure 2: Moore's Law, https://www.google.ca/search?hl=en&tbm=isch&source=hp&biw=1366&bih=651&ei=wd3pWuPdMqqPjwSUr4SQDQ&q=exponential+growth+in+computing+power&oq=growth+in+computing+power&gs_l=img.1.1.0j0i5i30k1.4574.14021.0.16389.28.27.0.1.0.0.182.2657.16j10.26.0....0...1ac.1.64.img..1.25.2467.0..0i24k1j0i8i30k1.0.eDlGB4j2AdI#imgrc=jhm-BdlhnmB2HM:.

[3] Figure 3: Columnar file formats, https://www.google.ca/search?hl=en&tbm=isch&source=hp&biw=1366&bih=651&ei=wd3pWuPdMqqPjwSUr4SQDQ&q=exponential+growth+in+computing+power&oq=growth+in+computing+power&gs_l=img.1.1.0j0i5i30k1.4574.14021.0.16389.28.27.0.1.0.0.182.2657.16j10.26.0....0...1ac.1.64.img..1.25.2467.0..0i24k1j0i8i30k1.0.eDlGB4j2AdI#imgrc=jhm-BdlhnmB2HM:.

[4] Index compression, https://nlp.stanford.edu/IR-book/html/htmledition/index-compression-1.html.

[5] Figure 6: Sequential vs. parallel data processing, https://www.google.ca/search?biw=1607&bih=678&tbm=isch&sa=1&ei=UVPwWu_uGoeYjwSkqrjwBA&q=sequential+db+processing&oq=sequential+db+processing&gs_l=img.3...0.0.0.123836.0.0.0.0.0.0.0.0..0.0....0...1c..64.img..0.0.0....0.jkNEKg1fCW0#imgdii=kH8ag2orN-LWNM:&imgrc=pBOBcUMsqlXNGM:&spf=1525699534175.

[6] Turn to in-memory processing when performance matters, https://searchdatacenter.techtarget.com/feature/Turn-to-in-memory-processing-when-performance-matters.

[7] Figure 8: Schematic of weights within neural net structure, https://www.google.ca/search?hl=en&tbm=isch&source=hp&biw=1366&bih=651&ei=bpvwWv2FM82O5wLqzLigCA&q=neural+net+simple+network&oq=neural+net+simple+network&gs_l=img.3...1065.21853.0.22452.38.24.0.14.14.0.120.1836.21j2.23.0....0...1ac.1.64.img..1.13.1052.0..0j0i24k1j0i10i24k1j0i10k1j0i7i30k1.0.nu7gREvNHkk#imgrc=13gO7BFb0GYZqM:.

[8] Figure 9: Examples of some optimization algorithms, https://www.google.ca/search?hl=en&tbm=isch&q=logistic+function&chips=q:logistic+function,g_5:logistical&sa=X&ved=0ahUKEwjw-KD5oPTaAhWkpFkKHSxSDJwQ4lYIMCgA&biw=1366&bih=651&dpr=1#imgrc=oAHIGiD5uTjw2M: ; https://www.google.ca/search?hl=en&tbm=isch&q=tan+function+graph&chips=q:tan+function+graph,g_1:tangent,online_chips:cos+tan&sa=X&ved=0ahUKEwjK-IGYovTaAhVQwlkKHUBnC0cQ4lYIKygC&biw=1366&bih=651&dpr=1#imgrc=gWnErav-9CIbGM:.

[9] "Is predictive analytics for marketers really that accurate?", Journal of Marketing Analytics, May 2013. https://link.springer.com/article/10.1057/jma.2013.8.

[10] "Data Mining for Managers: How to use data (big and small) to solve business problems", Palgrave Macmillan, October 2014.

Figure 1.  [1] The 5 V's of Big Data
Figure 2.  [2] Moore's Law
Figure 3.  [3] Columnar File Format
Figure 4.  Example of Structured Data
Figure 5.  Example of Twitter Data
Figure 6.  Sequential vs. Parallel Data Processing
Figure 7.  Schematic of Simple Neural Net with One Hidden Layer
Figure 8.  [7] Schematic of Weights within Neural Net Structure
Figure 9.  [8] Examples of some Optimization Algorithms
Figure 10.  Examples of Neural Nets
Figure 11.  Sample of 3 records
Figure 12.  Sample of 3 records-Fixed
Figure 13.  Frequency Distribution of Numeric Variable
Figure 14.  Frequency Distribution of Character Variable
Figure 15.  Example of Data Diagnostics
Figure 16.  Example of Alteryx Software
Figure 17.  Example of Gains/Decile Table
Figure 18.  Example of Final Model Variable Contribution Report
Figure 19.  Example of Final Model Variable Contribution Report