Running head: KUCHENA CELESTINO STATISTICAL PACKAGE REVIEW

Kuchena Celestino
Academic Development in Doctoral Studies
C. Kuchena
University of Zambia
GSB 8101
Dr. Rob Shah

Watching tutorials on statistical packages guided my decision about which two I will mainly use for analysing my results. I looked at five packages (Trifacta, RapidMiner, jamovi, PSPP, and JASP). Each has its strengths and weaknesses. My choice was determined by cost of ownership, user-friendliness, shareability of results, and the range of tests possible. This paper discusses Trifacta first, then RapidMiner, then jamovi, and lastly JASP.

Trifacta

We are using Trifacta within our organization for multiple use cases, including simple reporting, data profiling, issue detection, and data preparation for analytics. The main benefit of this tool is that you can get at the target data much more quickly than with any programming language.

Pros and Cons

Trifacta's strengths are data profiling, big data processing, data wrangling, and automation. Smart algorithms suggest data manipulation steps when you simply select data. On the downside, Hadoop source data tables are presented in a flat searchable list, whereas I would rather see them in their native hierarchy. The web interface can also be flaky, and we often need to refresh the page.

RapidMiner

We use RapidMiner to create ETL (Extract, Transform, Load) processes that load our BI datamarts with data from operational databases. We have created complex load processes, and we prepare the data that later feeds our Business Intelligence dashboards. We also use RapidMiner for data mining with different techniques, such as text processing, image processing, and algorithmic data analysis (clustering, neural networks, correlation, etc.).

Pros and Cons

RapidMiner is really fast at reading all kinds of databases.
We read and merge databases such as SQL Server, Informix, MySQL, and Oracle. Configuring access is easy: some drivers are built in, and it is not difficult to find new Java drivers that allow RapidMiner to connect to other databases. It performs all kinds of transformations, calculations (dates, percentages, and so on), joins, and filters without coding. We have several different databases, and this makes my life a lot easier. Since this part is 80% of an analyst's work, you can spend more time on the analysis itself and less on cleaning and preparing data. You can clone transformations to reuse them in new analyses, which saves a lot of time. There are many add-ons for different tasks (text analysis, image analysis, recommender systems, etc.). Training is easy, the tool is intuitive, and there are many videos on the internet. The community is very active.

Sharing a RapidMiner Studio analysis is not easy. You may think that RapidMiner Server does that work, but it does not. It is oriented more toward automated jobs, or toward running models on a website. If you need Business Analytics dashboards, this is not the tool; it is more of a backstage tool for analysts. Some charts are good, but others are not. The free edition allows you to work with 10,000 rows, but if you need more, it is not cheap (2,500 USD/year for 100,000 rows and 5,000 USD/year for 1,000,000 rows). The commercial team is not very responsive. I asked for a RapidMiner Education Program and a RapidMiner Server quotation and received no answer. I guess that is because they were changing from an open-source company model to a more commercial one.

Return on Investment

The impact is very positive because working with data is very fast and needs no coding. You save a lot of time.

Jamovi

Jamovi is free and offers one complete package for introductory statistics. It is a gem of a package, one that looks so good I asked the developers if they had an artist or user interface designer on the team.
They don't, but clearly they have put a lot of thought into how to make the software beautiful and easy to use. They have also chosen their options carefully so that each analysis includes what a researcher would want to see. Their creation of the jmv package is a bold move, one that promises to greatly reduce the number of separate packages a coder would need to learn, though in so doing they challenge the existing way of programming in R. Just as the tidyverse set of commands is controversial, the jmv package is also likely to ruffle some feathers.

As nice as jamovi is, it also lacks significant features, including: the ability to see and save data management syntax; the ability to handle date/time variables; the ability to perform many more fundamental data management tasks; the ability to save new variables such as predicted values or factor scores; and the ability to save models so they can be tested on hold-out samples or new data sets.

JASP

JASP is free, user-friendly software that performs frequentist and Bayesian analyses. It also performs advanced analyses such as Structural Equation Modelling. Its Summary Stats module allows analysing published results without needing the original data. It updates results as you go. JASP allows copying and pasting tables into Word in APA style. Furthermore, it permits saving plots in formats suitable for submission with an article. One can annotate results in JASP so that collaborators fully understand them. The data sync feature allows editing of data within JASP. The software also permits publishing results directly to OSF.

Conclusion

At this point, I will use both jamovi and JASP. JASP does Bayesian analyses, network analysis, and SEM, which jamovi does not do (yet). On the other hand, jamovi does HLM, confirmatory factor analysis, simple mediation and moderation, equivalence testing, sample size estimation, simple main effects, and some extensions to R.
Both can read SPSS and CSV files, but jamovi can also read JASP workspaces. I have tried both packages on Mac, Windows, and Linux, and they work perfectly. As they are both free, you have nothing to lose except disk space. I am using versions 0.9+ of both, and I imagine great things to come.

References

McKiernan, P., & Tsui, A. S. (2019). Responsible management research: A senior scholar legacy in doctoral education. Academy of Management Learning and Education, 18(2), 310-313.

Panigrahi, S. S., Bahinipati, B., & Jain, V. (2019). Sustainable supply chain management: A review of literature and implications for future research. Management of Environmental Quality: An International Journal.

Saunders, M., Lewis, P., & Thornhill, A. (2019). Research methods for business students.

Trifacta: https://www.trifacta.com/start-wrangling/

RapidMiner: https://rapidminer.com/get-started/

Jamovi: https://www.jamovi.org/

JASP: https://jasp-stats.org/

Tutorial: Trifacta Wrangler

Tutorial: RapidMiner Studio

Tutorial: Jamovi

Intro to JASP