Health IS Technology Blog

Data Mining: Improving Efficiency in Both Healthcare and Education

What is Data Mining?

Data mining’s purpose is to “discover useful information by analyzing data from many angles or dimensions, categorize that information, and summarize the relationships identified in the database.” This definition comes from Abdulmohsen Algarni’s article “Data Mining in Education”, published in the seventh volume of the International Journal of Advanced Computer Science and Applications. In short, data mining means sifting through large amounts of data to find the relationships, patterns, and trends within it.

Data mining helps to parse through all of the data accumulated by an organization.

The process helps to sift through the great amount of information produced by educational institutions, healthcare providers, and other organizations. A major benefit is that informed business insights and decisions become much easier to reach. Data mining removes the chaos from your data so that it can be more easily accessed, analyzed, and understood.

The process uses mathematical analysis to find patterns or trends in large amounts of data. The result of this hunt for patterns is the ability to make predictions: patterns already present in the data can be used to project patterns that are likely to appear in the future. Being able to project potential trends is extremely helpful for any organization. For example, if you’re an investor, wouldn’t you like to go back in time and know to invest in Microsoft’s stock ahead of everyone else?

The basic properties of data mining can be summarized in four points:

  • Focus on large data sets and databases
  • Automatic discovery of patterns
  • Prediction of likely outcomes
  • Creation of actionable information
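To make "prediction of likely outcomes" concrete, here is a minimal sketch of one very simple approach: fitting a straight line to past values and extending it one step forward. The enrollment figures and the function names are invented for illustration, and real data mining tools use far richer models than a single trend line.

```python
from statistics import mean

def fit_linear_trend(values):
    """Least-squares line through the points (0, values[0]), (1, values[1]), ..."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept

def project(values, steps_ahead=1):
    """Extend the fitted line past the last observed value."""
    slope, intercept = fit_linear_trend(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)

# Hypothetical yearly enrollment counts (made up for this example).
enrollment = [1000, 1100, 1200, 1300]
next_year = project(enrollment)
print(f"Projected enrollment next year: {next_year:.0f}")
```

A projection like this is only as trustworthy as the assumption that the past pattern continues, which is why the result still needs a human to judge it.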

Important Things to Remember

To start the data mining process, you need to build “mining models”: structures that collect and store information such as the patterns that may be hidden in your data. But your work doesn’t stop there. Data mining is a valuable technique, but computer programs can’t do all the work, even when they help you find critical patterns in your information. You still need to weigh the value of your data against the needs of your organization. As you go through your data, keep asking questions of it rather than expecting it to tell you everything unprompted.

Like any effective process, data mining should begin with goal development: What do you want to learn? What is the best way to get your answers?


Data Mining in Education

Data mining can be used in many settings, including education. Experts might use mining models and techniques to better understand students. The data might concern which environments students learn best in, or how students interact with educational software. Analysis of the collected data can even shed light on the reasons behind student failure so that it can be mitigated.

There are a few key principles in data mining.

Abdulmohsen Algarni defines the main steps of data mining in education as follows:

  • Discovery with models
  • Distillation of data for human judgement
  • Relationship mining
  • Clustering
  • Prediction

That “distillation of data for human judgement” is exactly what we mentioned above. No data can give answers to questions that you aren’t asking!

Relationship Mining

The most common technique in educational data mining is called relationship mining. This type of mining looks at the relationships or connections between different variables or factors. Examples that might be relevant for schools include a student’s choice of major, age, and number of visits to campus libraries or academic advisors. You can then look at the strength of the associations between these variables, which in turn can help you make better business decisions and better serve your students and faculty. Here’s an example of a relationship that might be revealed through the most prevalent relationship mining technique, called association mining:

Example: If a student buys a Geography textbook, and is a Geography major, then the student will pass Geography 101.

As you can see, association mining expresses its findings as “if/then” rules. In this example, there is a correlation, or connection, between a textbook purchase and success in the corresponding course. Still, your teachers were onto something when they kept reminding you about “correlation versus causation”. The former indicates any sort of connection, while the latter indicates a deeper relationship involving cause and effect. Relationship mining surfaces the connections; confirming whether one variable actually caused another takes further analysis. Don’t assume that because two variables are related, one produces the other. In this example, it’s very likely that another facet of the data would reveal a non-Geography major with no Geography textbook who also passed Geography 101.
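As a rough illustration of how the strength of an “if/then” rule like the Geography example can be measured, the sketch below computes three standard association-rule measures: support (how often the whole rule holds), confidence (how often the “then” holds when the “if” does), and lift (confidence compared with the overall pass rate). The student records, field names, and the `rule_metrics` helper are all hypothetical and invented for this example.

```python
# Hypothetical student records (made up for this example).
students = [
    {"bought_textbook": True,  "geography_major": True,  "passed": True},
    {"bought_textbook": True,  "geography_major": True,  "passed": True},
    {"bought_textbook": True,  "geography_major": True,  "passed": False},
    {"bought_textbook": True,  "geography_major": False, "passed": True},
    {"bought_textbook": False, "geography_major": True,  "passed": False},
    {"bought_textbook": False, "geography_major": False, "passed": True},
    {"bought_textbook": False, "geography_major": False, "passed": False},
    {"bought_textbook": False, "geography_major": False, "passed": True},
]

def rule_metrics(records, antecedent, consequent):
    """Measure the rule: if every field in `antecedent` is true, then `consequent` is true."""
    n = len(records)
    matches_if = [r for r in records if all(r[field] for field in antecedent)]
    matches_both = [r for r in matches_if if r[consequent]]
    matches_then = [r for r in records if r[consequent]]

    support = len(matches_both) / n
    confidence = len(matches_both) / len(matches_if) if matches_if else 0.0
    base_rate = len(matches_then) / n
    lift = confidence / base_rate if base_rate else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}

metrics = rule_metrics(students, ["bought_textbook", "geography_major"], "passed")
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

A lift above 1 says that students matching the “if” side pass more often than students overall, but it says nothing about why. Note that the made-up data also contains non-majors without the textbook who passed, which is exactly the kind of record that should make you cautious about reading the rule as cause and effect.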


Data Mining in Healthcare

If educational communities have an abundance of data, you can only imagine the expanse of data coming out of healthcare organizations. Data mining is a great way to decipher this tremendous amount of information. In a feature on data mining, USF Health Online said the following:

“In healthcare, data mining has proven effective in areas such as predictive medicine, customer relationship management, detection of fraud and abuse, management of healthcare, and measuring the effectiveness of certain treatments.”

A very important benefit of data mining in healthcare is its ability to improve the efficiency of processes. The more efficient a healthcare provider and system can be, the fewer unnecessary procedures, treatments, and costs patients will need to endure. In short, everyone wins.

Previously, we discussed an example of relationship mining in education. Healthcare data can be analyzed in similar ways, usually in terms of symptoms and causes. Different treatments and patients can be compared to determine how effective each treatment is. This promotes efficiency and speed, so that patients can get the best treatment possible as soon as possible.
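The comparison of treatments described above can be sketched in its simplest form as grouping patient records by treatment and comparing recovery rates. The records, treatment labels, and the `recovery_rates` helper below are hypothetical; a real analysis would also need to account for sample size and for differences between the patients who received each treatment.

```python
from collections import defaultdict

# Hypothetical (treatment, recovered) records, made up for this example.
records = [
    ("A", True), ("A", True), ("A", False), ("A", True),
    ("B", True), ("B", False), ("B", False), ("B", False),
]

def recovery_rates(records):
    """Return the share of patients who recovered under each treatment."""
    totals = defaultdict(int)
    recovered = defaultdict(int)
    for treatment, did_recover in records:
        totals[treatment] += 1
        recovered[treatment] += int(did_recover)
    return {treatment: recovered[treatment] / totals[treatment] for treatment in totals}

rates = recovery_rates(records)
best = max(rates, key=rates.get)
for treatment, rate in sorted(rates.items()):
    print(f"Treatment {treatment}: {rate:.0%} recovered")
print(f"Highest recovery rate in this sample: treatment {best}")
```

As with the education example, a higher rate in the data is a lead worth investigating, not proof that one treatment caused the better outcome.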


Potential for Data Mining in the Future and its Possible Downsides

One of the biggest concerns about data mining is its implications for privacy, especially in healthcare. Compiling patients’ data in one place can certainly become problematic. However, as long as all required processes are followed to collect, store, and grant access to that data securely, that concern can be appropriately addressed.

There is so much potential for data mining to take both education and healthcare to the next level of efficiency and progress. Because it can decrease costs in both fields, data mining is likely to be in increasingly high demand. In the future, the process might be the key to raising graduation rates in higher education. It might also be the key to finding new treatment and disease prevention plans, and maybe even cures. All we can do is wait and see where our experts take this revolutionary technique to shape our future for the better.