Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some basic questions about using and presenting the results #5

Open
jianzuo opened this issue Aug 29, 2024 · 9 comments
Open

Some basic questions about using and presenting the results #5

jianzuo opened this issue Aug 29, 2024 · 9 comments

Comments

@jianzuo
Copy link

jianzuo commented Aug 29, 2024

Hi,
I am trying to interpret some impedance spectra using your package.
As a new user, I followed the procedure described in ``Fitting EIS data.ipynb'' and
I have successfully obtained the drt inversion results.

Based on this, I have some basic questions about the DRT optimization and results plotting:

  • For a new impedance spectrum, what are the possible adjustable optimization parameters in the current dual inversion algorithm? For an unknown electrochemical system, the true drt is unknown and I want to make sure the obtained results are as accurate as possible.
  • Does the following code block give the final optimal results?
# Get the best discrete candidate identified by the dual algorithm
best_id = dual.get_best_candidate_id('discrete', criterion='bic')
best_model_dict = dual.get_candidate(best_id, 'discrete')
best_model_dict

For the final drt, how to visualize the peak separating plot: the individual peak distributions over the total distribution?

Besides the time constant, is it possible to obtain the frequency range of the identified peaks? (this is useful information for some application scenarios)

And is there a way to export the final drt and peak separation results data for further plots rather than he given plot functions?

I am trying to make sure I understand and present things correctly since I am using it for formal work. Thank you in advance for your help!

@jianzuo
Copy link
Author

jianzuo commented Sep 3, 2024

In my case, I have multiple impedance spectra measured from actual energy systems. The true distribution functions are thus unknown.
And some curves seem difficult to get the optimal results. For example, for the following criteria figure, how to decide the optimal number of peaks?
image

Concerning the warning:

UserWarning: Solution did not converge within 10 iterations
  warnings.warn(f'Solution did not converge within {max_iter} iterations')

Should we increase the number of interactions in practice? But I did not find out how.

My last observation is that the best model obtained from using RQ and HN can be significantly different, do you have some suggestions on which one to use? Or could this be improved by further optimizing the tunable parameters?

@jdhuang-csm
Copy link
Owner

Hi,

These are all good questions. I believe that some of them are addressed in the tutorial already, so I will point you to the relevant section when possible.

  • Q: For a new impedance spectrum, what are the possible adjustable optimization parameters in the current dual inversion algorithm? For an unknown electrochemical system, the true drt is unknown and I want to make sure the obtained results are as accurate as possible.
    • A: There are two components of the dual inversion algorithm to consider. The first is the hierarchical Bayesian inversion algorithm that is applied to estimate the continuous (conventional) DRT. This algorithm is applied with a range of regularization strengths to obtain many possible DRTs. This step already considers a wide range of possible hyperparameters to account for the fact that the optimal hyperparameters are never known a priori. However, you may obtain slightly different results by adjusting the hyperparameter l2_lambda_0, which is the overall ridge regression penalty strength. The default value is 100; larger values will tend to suppress DRT peaks, while smaller values will allow more variation in the DRT. The second step is the conversion of the continuous DRTs to discrete circuit models. In this step, the set of continuous DRTs from the first step are searched for peaks, and the identified peaks are converted to circuit elements. Probably the most importat parameters for this step are the peak-finding parameters. You can adjust these by providing a dict of keywords that are ultimately provided to scipy.signal.find_peaks. For example, to search for peaks with prominence greater than 0.01 and a height of at least 0.05, you can provide the following to dual_fit_eis: generate_kw={"find_peaks_kw":{"prominence": 0.01, "height": 0.05}}. Finally, if you find that the generated candidates have more peaks than expected - e.g. a pseudo-peak appears even in the simplest discrete model - you can generate simpler models by sequentially removing the smallest peaks from the generated models. To do so, add the min_fill_num argument to the generate_kw dict: generate_kw={"min_fill_num": 2} - in this example, the algorithm will ensure that a model with only 2 peaks is generated, even if all generated candidates appear to have more peaks.
  • Q: Does the following code block give the final optimal results?
    • A: If we take the BIC as the most reliable metric for model selection, then the code block you included should be optimal. It is, of course, difficult to decide what is 'optimal' or which metric is most robust. If you refer to the publication that introduces the dual inversion algorithm, you can find more information on the behavior and nuances of the different model selection metrics. In short, both BIC and LML have strengths and weaknesses. Considering both is likely wise, noting that when the two metrics disagree, there is probably significant uncertainty about which model is most suitable.
  • Q: For the final drt, how to visualize the peak separating plot: the individual peak distributions over the total distribution?
    • A: Please see the section of the tutorial titled "Separating and quantifying peaks".
  • Q: Besides the time constant, is it possible to obtain the frequency range of the identified peaks? (this is useful information for some application scenarios)
    • A: The characteristic frequency is given by inverting the time constant with a factor of 2*pi, i.e. f_c = 1 / (2 * pi * tau_0).
  • Q: And is there a way to export the final drt and peak separation results data for further plots rather than the given plot functions?
    • A: The tutorial section "Accessing fit parameters and quantities" describes how you can access the continuous distribution as a numpy array. If you would like to access the results from the discrete element model, you can use the following methods (assuming that model is the DiscreteElementModel instance of interest, and tau_grid is the grid of time constants at which you want to evaluate the DRT):
      model.predict_distribution(tau_grid) - to get the total DRT
      model.predict_element_distribution(tau_grid, i) - to get the DRT of the ith element (elements are sorted from shortest time constant to longest)
      If you need to export the distributions to files, look at numpy.savetxt or consider using pandas.

@jdhuang-csm
Copy link
Owner

For your second set of questions:

  • Q: Some curves seem difficult to get the optimal results. For example, for the following criteria figure, how to decide the optimal number of peaks?
    • A: Either 3 or 4 peaks are feasible in this case. With experimental data, as you noted, we never know the true answer. We can use statistics to guide our interpretation, but it cannot always give us a definitive answer. In this case, considering other measurements of the same system, as well as the possible physical processes, could help.
  • Q: Should we increase the number of interactions in practice? But I did not find out how.
    • A: This warning is not a problem, especially if you are using the dual fit algorithm. There is no need to increase the number of iterations. This warning actually should ideally be disabled for dual_fit_eis, so I will update the code.
  • Q: My last observation is that the best model obtained from using RQ and HN can be significantly different, do you have some suggestions on which one to use? Or could this be improved by further optimizing the tunable parameters?
    • A: Certainly, these can differ since they describe different phenomena. I would recommend considering which element is more likely to describe your system. The HN element is more prone to overfitting and parameter correlations due to its ability to describe asymmetric peaks. I would suggest using the RQ element if you don't have reason to believe that an HN element is necessary.

@jianzuo
Copy link
Author

jianzuo commented Sep 5, 2024

Thank you for your professional and informative response.
These are very helpful!
I will let you know if I have further questions, thank you!

@jianzuo
Copy link
Author

jianzuo commented Sep 6, 2024

I still have problems regarding the results of the final total DRT and the decomposed peaks (I use RQ as the base element in the ECM), specifically:

  1. Q: How to retrieve the generated continuous candidates ($\gamma$ and the corresponding regularization strength), so that I can construct the figure as presented in Fig. 3a) in your paper (https://doi.org/10.1016/j.electacta.2023.141879)?

  2. Q: I got a DRT curve of the best model (using the dual algorithm best_model.plot_eis_fit) as follows:
    image
    The best_model.parameter_dict gives {'R_R0': np.float64(1.7253759505369899), 'lnL_L0': np.float64(-13.543433335040868), 'R_RQ1': np.float64(0.06701622751033791), 'lntau_RQ1': np.float64(-11.729064813813391), 'beta_RQ1': np.float64(0.9010920700027231), 'R_RQ2': np.float64(0.1570738484226259), 'lntau_RQ2': np.float64(-6.45271670479093), 'beta_RQ2': np.float64(0.5613884997056322), 'R_RQ3': np.float64(0.1745053723192633), 'lntau_RQ3': np.float64(-4.720535433678493), 'beta_RQ3': np.float64(0.7796951837918176), 'R_RQ4': np.float64(0.03638181338700983), 'lntau_RQ4': np.float64(-0.2761498628368765), 'beta_RQ4': np.float64(0.9999999999999999), 'R_RQ5': np.float64(-0.11434509195485565), 'lntau_RQ5': np.float64(0.922638854832915), 'beta_RQ5': np.float64(0.9999999999999999)}.

When I try to recover the obtained data as you have pointed out using the following codes:

tau_plot = np.logspace(-8, 2, 1001)
gammas = best_model.predict_distribution(tau_plot)
labels = ['', '', 'p1', 'p2', 'p3', 'p4', 'p5']
plt.plot(tau_plot, gammas, label='DRT')
plt.xscale('log')
for i in range(2, 7):
    element_gamma = best_model.predict_element_distribution(tau_plot, i)
    plt.plot(tau_plot, element_gamma, label=labels[i])
    plt.legend()
    plt.xscale('log')

I got
image
I noticed that peaks 4 and 5 are not visible, I plotted them separately, and then I obtained the following:
image
This is strange and the total DRT in the second plot is different from the first figure.
How to get the correct DRT and the DRT of the i-th element results?

  1. Q: Related to this, I am trying to understand the peak identification and quantification algorithm you have used in the continuous model.
    I see you have explained the peak identification method in your paper, but I did not see any explanation for the quantification of the identified peaks (I mean the fitting of the peak and calculate the area under the peak). As a beginner in DRT analysis, this is not clear to me.

  2. Q: In the dual regression-classification algorithm, I am confused about the source of the final results after running best_model.plot_eis_fit(axes=axes[1]):

  • The total DRT is given by the continuous regression model, corresponding to the optimal Q (the number of RQ elements in the equivalent circuit)?

  • For the final individual peak (I consider it the same as you have mentioned above, the DRT of the i-th element), are they given by the discrete model, i.e. from the ECM?

  • The final results given by best_model.parameter_dict, are they come from the discrete ECM model?

Thank you in advance for the help!

@jdhuang-csm
Copy link
Owner

Hi, sorry for the delayed reply!

  1. This is actually generated during a PFRT fit (drt.pfrt_fit_eis). I have added the code to generate the plot you mentioned to the tutorial notebook in the PFRT section.

  2. The reason that peaks 4 and 5 are not visible in your plots is that they are fitted as nearly ideal RC elements (note that the dispersion parameter beta is very close to 1 for both elements). Thus, the analytical DRT for each of these elements is a Dirac delta function, which is infinitely narrow (i.e. gamma=inf for tau=tau_0, 0 otherwise). When you use model.plot_distribution or model.plot_element_distribution, the lines corresponding to singular elements like ideal RC elements are added to the plot separately as vertical lines. When you use model.predict_element_distribution, the returned array will be (nearly) zero everywhere unless your tau grid happens to contain a point extremely close to tau_0. Thus, I would recommend using model.plot_element_distributions, which takes an optional argument element_names, in which you can specify the list of elements that you want to plot (i.e. element_names=["RQ1"] to plot only P1). Otherwise, you can manually plot vertical lines at tau=tau_0 for each singular element.

  3. Can you clarify which function you are referring to?

  4. The object best_model is a discrete model, which contains no information about the continuous model from which it was generated (except the initial guess parameters used to initialize the fit). The total DRT is given by the sum of the DRT contributions of the elements contained in the model. The individual peaks and parameter values in parameter_dict also correspond to the discrete model.

@jianzuo
Copy link
Author

jianzuo commented Sep 16, 2024

Hi,
Thank you for your time and very detailed response to my questions!

  • Regarding Q2, I mean how the polarization resistance of the identified peak is estimated (for continuous models) in hybrid-drt? As you have mentioned in response 3, these values are used as initial values for fitting discrete ECMs.

Thanks!

@jdhuang-csm
Copy link
Owner

I see, thanks for clarifying. After peak positions have been identified, the individual DRT contribution of each peak is estimated. This is done by first identifying troughs or valleys between peaks, then assigning a weighting function to each peak that decays with a characteristic length scale determined by the distance to the trough. The weighting functions for all peaks are then used to assign a fraction of the DRT at each timescale to each peak. The individual peak distributions can then be integrated to obtain peak polarization resistances, and circuit parameters (such as dispersion or asymmetry) can be estimated from the distributions as described in the paper.

You can see the result of this non-parametric peak deconvolution process by using drt.plot_peak_distributions (and you can obtain the peak distribution arrays using drt.estimate_peak_distributions). If you'd like to examine the details of the algorithm, you can check the function hybdrt.peaks.estimate_peak_weight_distributions.

@jianzuo
Copy link
Author

jianzuo commented Sep 21, 2024

Hi,
Thank you very much for the detailed explanation.
I am going to check them one-by-one.
Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants