What Is Data Science? An Introduction for and by Electrochemical Engineers

By: Neal Dawson-Elli, Seong Beom Lee, Manan Pathak, Kishalay Mitra, and Venkat R. Subramanian

This article refers to a recently published open access paper in the Journal of The Electrochemical Society, “Data Science Approaches for Electrochemical Engineers: An Introduction through Surrogate Model Development for Lithium-Ion Batteries.”

Electrochemistry and Data Science

Image via Neal Dawson-Elli

Data science is often hailed as the fourth paradigm of science. As the computing power available to researchers increases, data science techniques become relevant to a growing group of scientists. Yet a quick literature search for electrochemistry and data science reveals surprisingly little work that applies data science methods to electrochemical problems. The paper is an attempt to introduce the topics of data science to electrochemists, and to analyze the power of these methods when combined with physics-based models.

At the core of the paper is the idea that one cannot succeed by treating every problem as a black box and liberally applying data science. Despite its growing popularity, it is not a panacea. The image shows the basic workflow for using data science techniques: creating a dataset, splitting it into training and test sets, training a model, and then evaluating the model on some task. In this case, the training data comes from many simulations of the pseudo-two-dimensional (P2D) lithium-ion battery model. However, to get the best results, one cannot simply pair the inputs and outputs and train a machine learning model on them. The inputs, or features, must be engineered to better highlight changes in the output data, and sometimes the problem needs to be restructured entirely.
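The workflow described above (create a dataset, split it, train, evaluate) can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions, not the method used in the paper: `toy_simulator` is a hypothetical stand-in for a physics-based simulation, the model is ordinary linear least squares, and the "engineered" feature is simply a ratio that we assume domain knowledge would suggest.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_simulator(params):
    """Hypothetical stand-in for a physics-based simulation.
    This is NOT the battery model from the paper, only a toy function."""
    a, b = params[:, 0], params[:, 1]
    return a / b

def fit_linear(X, y):
    """Fit y ~ X @ w + bias by ordinary least squares."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_linear(coef, X):
    A = np.column_stack([X, np.ones(len(X))])
    return A @ coef

def mean_abs_pct_error(y_true, y_pred):
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))

# 1. Create the dataset: sample inputs and run the (toy) simulation.
#    A 1% multiplicative noise term is added so the fit is not trivially exact.
params = rng.uniform(1.0, 5.0, size=(500, 2))
targets = toy_simulator(params) * (1.0 + 0.01 * rng.standard_normal(500))

# 2. Split into training and test sets.
idx = rng.permutation(len(params))
train, test = idx[:400], idx[400:]

# 3a. Train and evaluate on the raw inputs.
raw_coef = fit_linear(params[train], targets[train])
raw_err = mean_abs_pct_error(targets[test], predict_linear(raw_coef, params[test]))

# 3b. Train and evaluate on an engineered feature (the ratio a / b).
ratio = (params[:, 0] / params[:, 1]).reshape(-1, 1)
eng_coef = fit_linear(ratio[train], targets[train])
eng_err = mean_abs_pct_error(targets[test], predict_linear(eng_coef, ratio[test]))

print(f"raw features:       {raw_err:.2f}% mean absolute error")
print(f"engineered feature: {eng_err:.2f}% mean absolute error")
```

The same model does far better once the input is expressed in a form that tracks the output, which is the point of feature engineering. In a real study, choosing that form is where knowledge of the system comes in.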

In this paper, the most successful approach is the recurrent approach, which splits the simulated discharge curves into 5-minute chunks and lets the algorithm estimate the voltage one minute ahead of the current time from the previous few voltages, resulting in an average error below 0.5 percent. However, to predict the state of charge (the fraction of the battery's capacity that remains) more accurately, the problem must be restructured again. The main takeaway from the paper is that getting the best result from a machine learning model requires taking the time to understand the system and to consider non-obvious formulations.
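To make the recurrent formulation concrete, here is a rough sketch of how a discharge curve can be cut into short windows so that a model predicts the next voltage from the previous few. Everything specific here is an assumption for illustration: `toy_discharge_curve` is a made-up curve shape rather than output from the battery model, the curve is sampled once per minute, a window holds five samples, and a plain linear least-squares fit stands in for whatever model the paper actually trains. The error it prints says nothing about the paper's reported results.

```python
import numpy as np

def toy_discharge_curve(scale, n_minutes=60):
    """Hypothetical discharge curve sampled once per minute.
    The shape is invented for illustration; it is not a physics-based result."""
    t = np.arange(n_minutes) / n_minutes
    return 4.2 - 0.6 * scale * t - 0.3 * np.exp(8.0 * (t - 1.0))

def make_windows(curve, window=5, horizon=1):
    """Pair each run of `window` consecutive voltages with the voltage
    `horizon` samples after the last one in the run."""
    X, y = [], []
    for start in range(len(curve) - window - horizon + 1):
        X.append(curve[start:start + window])
        y.append(curve[start + window + horizon - 1])
    return np.array(X), np.array(y)

# Training set: windows drawn from several simulated curves.
train_scales = [0.8, 0.9, 1.1, 1.2]
pairs = [make_windows(toy_discharge_curve(s)) for s in train_scales]
X_train = np.vstack([X for X, _ in pairs])
y_train = np.concatenate([y for _, y in pairs])

# Test set: windows from a curve the model has not seen.
X_test, y_test = make_windows(toy_discharge_curve(1.0))

# Fit a linear one-step-ahead predictor (with a bias term) by least squares.
A_train = np.column_stack([X_train, np.ones(len(X_train))])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

A_test = np.column_stack([X_test, np.ones(len(X_test))])
pred = A_test @ coef
pct_err = 100.0 * np.mean(np.abs((y_test - pred) / y_test))
print(f"mean absolute error on held-out curve: {pct_err:.4f}%")
```

Restructuring the problem this way turns one long curve into many small training examples, and each prediction only needs recent measurements rather than the full set of model inputs.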


As technical editor of the electrochemical engineering technical interest area of the Journal of The Electrochemical Society, Subramanian encourages and solicits perspective and research articles in the application of data science approaches for modeling, analysis, design, and optimization of electrochemical systems.

