EXAFS Modeling (Feffit / Artemis / SixPack)

How do I use constraints or restraints in my model?

Here are some contributions from the Ifeffit mailing list:

Bruce Ravel gives a tutorial on the use of constraints and restraints, and Matt Newville adds a bit on adding penalties to restraints.

Bruce adds: The bit about the "quiz" at the end of my post linked above is misleading. When I wrote that, I was misunderstanding one detail about how restraints work. The rest of the post is useful, however.

Scott Calvin has described a method to restrain parameters for an unknown structure to be close to those for a similar known structure, and also discusses how to relate the weighting to the uncertainty and epsilon. Scott also suggests: Also, fits with restraints are very useful to me as a diagnostic. If a fit is insisting on an S02 of 2.63, for example, I'll try restraining the S02 to 0.90 (weighted in such a way that +/- 0.20 is not too heavily penalized). If the fit then happily chooses 0.87 or something like that, I know I'm dealing with a true "false minimum." If, on the other hand, the fit pulls the S02 as high as it can given the penalty (say to 1.50) then I know it's some other kind of problem.
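To make the mechanics concrete, here is a minimal numpy sketch (plain Python, not Ifeffit syntax) of a restraint in the spirit of Scott's S02 example: the penalty ((amp - 0.90)/0.20)**2 is simply added to chi-square, so the data and the restraint compete, and good data wins. All numbers below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
s = np.sin(x)
eps = 0.05                                    # assumed noise level
data = 0.92 * s + rng.normal(0, eps, x.size)  # "true" amplitude is 0.92

# Minimize  sum((data - a*s)**2)/eps**2 + ((a - 0.90)/0.20)**2  over a.
# Setting the derivative to zero gives a weighted average in closed form:
a_fit = ((np.sum(data * s) / eps**2 + 0.90 / 0.20**2)
         / (np.sum(s * s) / eps**2 + 1.0 / 0.20**2))
print(round(float(a_fit), 3))
```

Because the data carry far more weight than the single restraint term here, the fit lands near the true amplitude rather than being dragged to 0.90; shrinking the 0.20 width would strengthen the restraint's pull.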

See also this earlier discussion of restraints from Matt.


How do I define sigma2 values for multiple scattering paths?

See these suggestions from the Ifeffit mailing list:

Shelly Kelly: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2004-January/000634.html

Matt Newville: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2004-January/000636.html

John Rehr: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2004-January/000641.html

Grant Bunker: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2004-January/000640.html

Earlier comment by John Rehr on linear paths in uranyls: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2003-October/000520.html

and a reference from Sam Webb: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2003-October/000521.html


I have too many variables and not enough independent data points!

See this discussion from the mailing list.
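The usual rule of thumb here is the Nyquist estimate of the number of independent points, n_idp ≈ 2ΔkΔR/π (some programs add a small constant to this; Stern argued for +2). A short sketch with illustrative ranges:

```python
import math

# Illustrative fitting ranges (not from any particular data set)
kmin, kmax = 2.0, 12.0   # k-range in inverse Angstroms
rmin, rmax = 1.0, 3.0    # R-range in Angstroms

# Nyquist estimate of how many independent points the data can support
n_idp = 2 * (kmax - kmin) * (rmax - rmin) / math.pi
print(round(n_idp, 1))
```

A fit using these ranges could support roughly a dozen variables at most; in practice, using far fewer than n_idp variables is strongly recommended.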


How can I determine an appropriate k-range for fitting?

Suggestions from the Ifeffit mailing list:

Shelly Kelly: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2004-January/000623.html

Matt Newville: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2004-January/000627.html


Why should I look at the real and imaginary parts of the FT?

One day, this question was posed on the mailing list.

Hi, I wonder if anyone can tell me what the advantage is of using the real and
imaginary parts of the transform for data and fit, compared with a magnitude plot.

The Fourier transform used in Ifeffit is a complex transform. Thus chi(R) is a complex function. The real and imaginary parts contain the same information as the magnitude and the phase, but they display it differently.

As for why the real and imaginary parts are used when Ifeffit evaluates the fit, Matt explains:

The real and imaginary parts of the Fourier transform include both the magnitude and phase 
terms.  For XAFS, the phase term is much more sensitive than the magnitude to the distance 
and atomic species of the scattering atom.  

In principle one could fit to the phase and magnitude terms instead of the real and 
imaginary terms.  My experience was (many years ago now) that this was not quite as robust 
as using the real and imaginary terms, mostly because of the 2*pi ambiguity in the phase.
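Matt's point about the 2*pi ambiguity can be illustrated with a few lines of numpy (a toy chi(R), not real data): the real and imaginary parts vary smoothly, while the extracted phase wraps at +/-pi, producing jumps of nearly 2*pi that a fitting routine would have to unwrap.

```python
import numpy as np

r = np.linspace(0, 6, 601)
# toy complex chi(R): a Gaussian peak at R = 2 with a rapidly varying phase
chi_r = np.exp(-((r - 2.0) / 0.3)**2) * np.exp(1j * 5.0 * r)

phase = np.angle(chi_r)                 # wrapped into (-pi, pi]
jumps = np.abs(np.diff(phase)) > np.pi  # near-2*pi discontinuities in the phase
print(int(jumps.sum()))                 # several wrap-around jumps
print(float(np.abs(np.diff(chi_r.real)).max()))  # real part varies smoothly
```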


How are error bars evaluated in Ifeffit and Artemis?

Here is an explanation from Bruce from the mailing list:

Matt's follow-up is very helpful:

Pages 6 through 15 of the talk that Bruce gave at the 2008 APS XAFS summer school are relevant to the last paragraph from the first link:


What are 'good' values for R and chi-squared?

The R-factor measures the misfit between the fit and the data, relative to the size of the data. A value of R-factor > 0.05 often indicates a bad fit. That value is not cast in stone, but it can be used as a threshold for being skeptical about a fit.

The values of chi-square and reduced chi-square are complicated by the fact that they depend on the uncertainty in the data, which we can only estimate. If the uncertainty is not specified in the fit (the usual case), Ifeffit estimates it from the white noise in the data. This estimate works well for very noisy data, but tends to underestimate the uncertainty in good data, as it completely misses any non-stochastic components of the uncertainty (usually called 'systematic errors', meaning errors that aren't random).

Statistics 101 says that reduced chi-square should be ~1 for a 'good fit', but Statistics 101 assumes that a) you know the uncertainty in the data, b) you know how many independent measurements (i.e., how much data) you have, and c) the model has no error in it. For EXAFS, all of these can be questioned. Experience says that reduced chi-square values around 10 are common for excellent fits to data of well-characterized standards, such as a metal foil or a powdered metal oxide. This is generally attributed to a combination of a) and c) above (the data is noisier than the white noise suggests and, though Feff is good, it is not perfect).
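As a concrete illustration of these definitions, here is a small numpy sketch using the standard formulas for R-factor and reduced chi-square. The data, the "fit", and the values of n_idp and n_varys are all made up for illustration; Ifeffit actually evaluates these sums over the fitting range in R-space.

```python
import numpy as np

rng = np.random.default_rng(1)
data = np.sin(np.linspace(0, 10, 100))        # stand-in for the data
fit = data + rng.normal(0, 0.02, data.size)   # a pretend fit, off only by noise
eps = 0.02                                    # estimated uncertainty in the data
n_idp, n_varys = 12.7, 4                      # illustrative values

r_factor = np.sum((data - fit)**2) / np.sum(data**2)  # misfit relative to data
chi_square = np.sum(((data - fit) / eps)**2)          # misfit relative to eps
chi_reduced = chi_square / (n_idp - n_varys)          # per degree of freedom

print(r_factor < 0.05)   # a "good" fit by the rule of thumb above
```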


How do I compare different fits? R-factor? Chi-square?

A common task in analysis is to compare two different fits to determine which is better. It's not easy to compare vastly different fits (say, completely different models applied to completely different data), so I'll ignore questions like "Is this fit to 2 shells of TiO2 better than that fit to Zn in solution?" and focus on how to compare fits with similar models and similar data.

First, the R-factor (r_factor) should be the preferred statistic for telling whether a fit is "good". It measures the misfit relative to the data: a fractional misfit. A value below 0.05 generally means a pretty good fit. In addition, it is important to look at the fit in R-space and to check that the k- and R-ranges used in the fit are appropriate.

Chi-square (chi_square) is the misfit relative to the estimated uncertainty (epsilon_r for R-space fits). Because the uncertainty is difficult to estimate, this chi-square is usually much higher than the number of free parameters in the fit (nu = n_idp - n_varys, where n_idp is the number of independent points that the data can support and n_varys is the number of variables used), which is what a statistics book will tell you it should be for a good fit.

Either of these statistics can be used to compare different fits if the number of variables and the data k- and R-ranges are the same for the two models.

As you add more variables to a model, the misfit should go down, and both chi-square and R-factor should become smaller. But that alone does not mean the fit is "better": you could keep adding variables that make a tiny improvement to the fit but have no physical meaning. The generally accepted standard is that reduced chi-square (chi_reduced) is the principal statistic for comparing fits, as it takes into account the number of variables (n_varys) and the number of independent data points (n_idp), which in turn accounts for the k- and R-ranges.

The simplest approach is then to treat "which fit has the lowest reduced chi-square?" as equivalent to "which fit is better?". As an important example, reduced chi-square can be used in this way to determine whether an additional variable is warranted.

Strictly speaking, the "which chi_reduced is smaller" test is not a sufficient statistical test, and more elaborate chi-square tests can be performed to test the probability that one fit is better than another. In practice, when these two methods disagree, it means the data cannot easily distinguish the two models.
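One standard version of such a test is the F-test for nested models. Here is a sketch; the chi-square values and n_idp below are illustrative, not from a real fit, and the usual caveats about estimating n_idp and epsilon apply.

```python
def f_test(chi2_1, p1, chi2_2, p2, n_idp):
    """F statistic for whether the extra (p2 - p1) variables in model 2
    are justified by the drop in chi-square from model 1 to model 2."""
    num = (chi2_1 - chi2_2) / (p2 - p1)   # improvement per added variable
    den = chi2_2 / (n_idp - p2)           # remaining misfit per free point
    return num / den

# Illustrative numbers: adding one variable drops chi-square from 120 to 80
print(round(f_test(120.0, 4, 80.0, 5, 12.7), 2))
```

A large F value suggests the added variable is doing real work; an F near 1 suggests it is merely absorbing noise. Converting F to a probability requires the F distribution with (p2 - p1, n_idp - p2) degrees of freedom.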


What is the meaning of S02?

See John Rehr's contribution to the Ifeffit mailing list: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2003-February/000230.html


How do I measure coordination numbers?

From the Ifeffit mailing list archives:

Bruce's answer: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2002-October/000151.html

Scott's comment: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2002-October/000152.html

Matt's comment: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2002-October/000153.html

Bruce again: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2002-October/000154.html

John's Comment: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2002-October/000155.html


What is the 'imaginary energy' correction (Ei)?

See Bruce Ravel's discussion of Ei from the Ifeffit mailing list: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2004-February/000699.html

Ei affects the amplitude component; this is discussed in the section of this FAQ on S02.


How do I select k-weighting for my data?

The k-weight should not substantially affect the results of your fit; if it does, something is wrong. However, it will affect the correlations between your model parameters.

Ifeffit and Artemis allow you to fit your model simultaneously to data weighted by several k-weights, and this is recommended in order to break those correlations.
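The effect of k-weighting can be seen with a toy chi(k) in plain numpy (not Ifeffit syntax): each weight w multiplies chi(k) by k**w, so higher weights emphasize the high-k portion of the data. Fitting with several k-weights at once simply means the residuals for each weight are minimized together.

```python
import numpy as np

k = np.linspace(2, 12, 201)                                # k in inverse Angstroms
chi = np.exp(-2 * 0.003 * k**2) * np.sin(2 * 2.5 * k) / k  # toy chi(k)

for w in (1, 2, 3):
    chik = chi * k**w                         # k-weighted chi(k)
    k_of_max = k[np.argmax(np.abs(chik))]     # where the signal is largest
    print(w, round(float(k_of_max), 2))       # maximum moves to higher k with w
```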

See the following suggestions from the Ifeffit mailing list:

Scott Calvin: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2004-February/000706.html

Shelly Kelly: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2004-February/000705.html

Bruce Ravel discusses the limitations of selecting a single k-weight based on backscatterer mass: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2004-February/000707.html


How can I assign different kmin/kmax to multiple data sets in Ifeffit?

From Matt's posting to the Ifeffit mailing list: http://millenia.cars.aps.anl.gov/pipermail/ifeffit/2004-April/000787.html

You can specify different kmin,kmax,rmin, and rmax parameters for each data set. A simple example would be:

  read_data(file = my_data.chi, group=dat, type=chi)
  guess(Amp= 0.9, E0= 0,  dR1= 0.00)
  path(1, feff= feff0001.dat, s02= Amp, e0= E0, delr= dR1)

  set(kmin= 2, kmax= 12, dk= 1, rmin= 1, rmax= 3, kweight= 2)

  feffit(chi= dat.chi, kmax= 10, kweight= 1, data_set= 1, data_total= 2)
  feffit(chi= dat.chi, kmax= 11, kmin= 3,    data_set= 2, data_total= 2)

To address your question, this example uses both 'globally set' FT parameters, and ones explicitly set in the feffit() command. For the first data set, kmin, dk, rmin, rmax are taken from the 'global parameters', while the explicit values for kmax and kweight are used. For the second data set, kmax and kmin are explicit, and dk= 1,rmin= 1,rmax= 3, and kweight= 2, as set from the global values.

An important note is that after feffit() runs completely, the global program variable kmax will be set to the last used value - 11.0 in this case. This is NOT true if feffit() has data_set<data_total. Thus, for the second data set above, kweight= 2, not 1.

The use of global parameters might be confusing (but it made sense to me at the time!). They can be convenient at the command line or in quickly written scripts, but you need to be able to overwrite them, and that can lead to confusion. The peculiar behavior of feffit() for multiple-data-set fits is the result of not having an obvious "right way" of doing it. You can explicitly state all the FT and range parameters. This avoids confusion, but does take more typing. I'm pretty sure this is what Artemis does, which makes sense since it's "doing the typing" for you.

Finally, the feffit() keywords that are applied to a data set are:

  rmin rmax kmin kmax dk dk1 dk2 kwindow kweight 
  epsilon_k epsilon_r fit_space  data_set chi k

oh, and the path list goes with each data set of course.


How small a k-range can I use?

Here are some useful suggestions from the Ifeffit mailing list.

Scott Calvin discusses the limitation that k-range imposes on spatial resolution, while Matt talks about how low kmin can be.

Shelly gives her usual very practical advice for determining a suitable k-range, with an addendum from Scott.


How do I handle doped materials? Why doesn't Atoms handle doped materials?

Before I launch into this, here are some useful links on this topic:

Atoms is, except in extremely contrived situations, not capable of writing a proper feff.inp file for a doped material. This is not a programming shortcoming of Atoms, but a number-theoretic limitation imposed by the physical model used by FEFF.

In the feff.inp file, there is a big list of atomic coordinates. The reason that people like using Atoms is because, without Atoms, it is a pain in the ass to generate that list. The virtue of Atoms is that it automates that annoying task for a certain class of matter, i.e. crystals.

FEFF expects a point in space to be either unoccupied or occupied by a specific atom. A given point may be occupied neither by a fraction of an atom nor by two different kinds of atoms.

Let's use a very simple example -- gold doped into fcc copper. In fcc copper, there are 12 atoms in the first shell. If the level of doping was, say, 25%, then Atoms could reasonably use a random number generator to choose three of the 12 first neighbor sites and replace them with gold atoms. However, what should Atoms do with the second shell, which contains 6 atoms? 25% of 6 is 1.5. Feff does not allow a site to be half occupied by an atomic species, thus Atoms would have to decide either to over-dope or under-dope the second shell.

This problem only gets worse if the doping fraction is not a rational (in the number theory sense) fraction, if the material is non-isotropic, or if the material has multiple sites that the dopant might want to go to.

Because Atoms cannot solve this problem correctly except in the most contrived of situations, I decided that Atoms would not attempt to solve it in any situation. If you specify dopants in Atoms' input data, the list in the feff.inp file will be made as if there were no dopants.

This leads to two big questions:

  1. Why are dopants allowed in Atoms at all?
  2. How does one deal with XAS of a doped sample?

The first question is the easy one. Atoms can do other things besides generating feff.inp files. Calculations involving tables of absorption coefficients and simulations of powder diffraction and DAFS spectra make use of the dopant information.

The second question is the tricky one, and the answer is somewhat different for EXAFS than for XANES. The chapter in the PDF file mentioned at the beginning of this FAQ question discusses one approach to analyzing EXAFS of doped materials. Scott discusses a few more. The bottom line is that you need to be creative and willing to run Feff more than once.

The best approach to simulating a XANES spectrum on a doped material that I am aware of also involves running FEFF many times. One problem a colleague of mine asked me about some time ago was the situation of oxygen vacancies in Au2O3. After some discussion, the solution we came up with was to use Atoms to generate the feff.inp for the pure material. My friend then wrote a little computer program that would read in the feff.inp file, randomly remove oxygen atoms from the list, write the feff.inp file back out with the missing oxygens, and run FEFF. He would do this repeatedly, each time replacing a different set of randomly selected atoms and each time saving the result. This set of computed spectra was then averaged. New calculations were made and added to the running average until the result stopped changing. If I remember, it took about 10 calculations to converge.

This random substitution approach would work just as well for dopants as for vacancies.
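The loop Bruce describes can be sketched schematically in Python. This is hypothetical, not the colleague's actual program: compute_spectrum() stands in for "write feff.inp, run FEFF, read the result", and the site counts echo the fcc copper example above.

```python
import random

def compute_spectrum(sites):
    # placeholder for a real FEFF run on one random configuration;
    # here just a number that depends on how many host atoms remain
    return sum(1.0 for s in sites if s == "Cu") / len(sites)

def doped_average(n_sites=12, dope_fraction=0.25, n_configs=10, seed=42):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_configs):
        sites = ["Cu"] * n_sites
        # replace a randomly chosen subset of sites with the dopant
        for i in rng.sample(range(n_sites), round(dope_fraction * n_sites)):
            sites[i] = "Au"
        total += compute_spectrum(sites)
    return total / n_configs   # running average over configurations

print(doped_average())
```

In practice, one would keep adding configurations until the running average stops changing, just as described above; for vacancies, the substitution step simply deletes the chosen atoms instead of relabeling them.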


How do I add a new FAQ entry?

Edit this section, use copy and paste! Use a format like this, using the same level of header (i.e. the same number of = signs) and keeping the link back to the contents at the top of the page:

== How do I add a new FAQ entry? ==
     Edit this section, use copy and paste! 
[#contents Contents]


FAQ/FeffitModeling (last edited 2010-02-25 01:55:15 by BruceRavel)