Latest Computational Methods of Data Wrangling in Applied Linguistics


Minnaa Ahmad
Aqsa Shereen
Muhammad Shoaib Tahir

Abstract

Data wrangling, the process of cleaning, transforming, and mapping raw data into a usable format, is a fundamental step in linguistic research. With the exponential growth of digital text and spoken language data, computational methods have become essential for managing and analyzing large datasets. This paper explores the latest computational techniques for data wrangling in linguistics, highlighting advances in machine learning, natural language processing (NLP), and automated data cleaning. These innovations improve the efficiency and accuracy of data processing, enabling researchers to uncover patterns and generate insights that were previously unattainable. The paper also addresses the challenges inherent in data wrangling, including the integration of diverse data sources, the complexity of real-time data processing, and the need to maintain data quality and comply with privacy regulations. Through a detailed examination of current methodologies and emerging trends, this research underscores the critical role of data wrangling in advancing linguistic studies and offers practical solutions to common obstacles in the field. The findings suggest that continued advances in AI-driven tools and automated solutions will significantly shape the future of linguistic data analysis, making it more accessible and effective.


How to Cite
Ahmad, M., Shereen, A. and Tahir, M.S. 2024. Latest Computational Methods of Data Wrangling in Applied Linguistics. Journal of Policy Research. 10, 2 (Jun. 2024), 718–725. DOI: https://doi.org/10.61506/02.00290.
