Tutorial for Stacked Trapezoid model
This tutorial shows how to do simulations for stacked trapezoid model and how to fit the experimental data using this model.
[1]:
from cdsaxs.simulations.stacked_trapezoid import StackedTrapezoidSimulation
import numpy as np
Simulation
Prepare the data
[2]:
pitch = 100 #nm distance between two lines
qzs = np.linspace(-0.5, 0.5, 121)
qxs = 2 * np.pi / pitch * np.ones_like(qzs)
#parameters
dwx = 0.1
dwz = 0.1
i0 = 10
bkg = 0.1
y1 = 0.
height = 20. #same as [20., 20.]
bot_cd = 40.
swa = [80., 80.0]
langle = np.deg2rad(np.asarray(swa))
rangle = np.deg2rad(np.asarray(swa))
#simulation data
i_params = {'heights': height,
'langles': langle,
'rangles': rangle,
'y_start': y1,
'bot_cd': bot_cd,
'dwx': dwx,
'dwz': dwz,
'i0': i0,
'bkg_cste': bkg,
}
Description of the parameters:
dwx, dwz, i0 and bkg are the parameters necessary to calculate the Debye-waller factor which accounts for the real world imperfections in the model. You can read more about it here.
The other parameters are the geometrical parameters of the model. The parameters are as follows:
y_start - starting y-coordinate of the base of the nano structure
bot_cd - bottom width (CD for critical dimension) of the nano structure
heights - height of each individual trapezoid can be a single value or a list of values in case of list of values, each value corresponds to the height of each individual trapezoid.
langles - list of all the left bottom angles of each individual trapezoid. The dictionary which is passed should have it in radians.
rangles - list of all the right bottom angles of each individual trapezoid. The dictionary which is passed should have it in radians.
weight - weight of each individual trapezoid to account for the fact that they could be made of different materials. here we will assume that all the trapezoids are made of the same material hence the weight is not necessary.
Note: In symmetric case either left or right angle can be passed and the other will be calculated using the symmetry.
So we are constructing a line space pattern and the cross section of each line looks like following:

Create instance of the Simulation class and call simulate_diffraction
method
[3]:
Simulation1 = StackedTrapezoidSimulation(qys=qxs, qzs=qzs)
intensity = Simulation1.simulate_diffraction(params=i_params)
#plot
import matplotlib.pyplot as plt
plt.plot(qzs, intensity)
plt.xlabel(r'$q_{z}$')
plt.ylabel('Intensity')
plt.title('Stacked Trapezoid diffraction simulation')
[3]:
Text(0.5, 1.0, 'Stacked Trapezoid diffraction simulation')

Fitting
A txt or csv file with the experimental data containing the values for \(Q_{x}\), \(Q_{z}\) and intensities can be in the format as shown below:
```
qx, qz, intensity
0.1, 0.2, 0.3
0.2, 0.3, 0.4
0.3, 0.4, 0.5
...
```
for a text file you can read the data using the following code:
import numpy as np
data = np.genfromtxt('path_to_file.txt', delimiter=',', skip_header=1)
qx = data[:,0]
qz = data[:,1]
intensity = data[:,2]
We will suppose that we’ve read the data and stored it in the variables qx, qz and intensity. And we’ll use the simulated intensities used in the previous section.
Let’s introduce a bit of noise to the simulated intensities to make it more realistic.
[4]:
import matplotlib.pyplot as plt
intensity_noisy = intensity + np.sqrt(intensity) * np.random.normal(0, 300, intensity.shape)
plt.plot(qzs, intensity, label='original')
plt.plot(qzs, intensity_noisy, label='added noise')
plt.ylabel('Intensity')
plt.xlabel('qz')
plt.legend()
[4]:
<matplotlib.legend.Legend at 0x700729f7f1d0>

Now that we have our experimental data good to go for the fitting we need to provide a list of initial values and bounds for the parameters. The bounds are necessary to make sure that the fitting algorithm doesn’t lose its way looking for impossible values.
[5]:
initial_params = {'heights': {'value': height+2, 'variation': 10E-1},
'langles': {'value': langle-0.2, 'variation': 10E-1},
'rangles': {'value': rangle-0.50, 'variation': 10E-1},
'y_start': {'value': y1+1, 'variation': 10E-1},
'bot_cd': {'value': bot_cd-1, 'variation': 10E-2},
'dwx': {'value': dwx, 'variation': 10E-5},
'dwz': {'value': dwz, 'variation': 10E-5},
'i0': {'value': i0, 'variation': 10E-5},
'bkg_cste': {'value': bkg, 'variation': 10E-5}
}
Create an instance of the Simulation class and pass it to the Fitter class along with data to fit
[6]:
from cdsaxs.fitter import Fitter
Simulation2 = StackedTrapezoidSimulation(use_gpu=False, qys=qxs, qzs=qzs, initial_guess=initial_params)
Fitter1 = Fitter(Simulation=Simulation2, exp_data=intensity)
Then we’ll call the cmaes
method of the Fitter class to fit the data. It’s important to know that this method provides a normalized set of values between -sigma and sigma and it is rescaled to the parameter values according to the model used.
sigma - the standard deviation of the values generated by the cmaes
method.
Note: Similar to bound denoted by variation the parameter sigma helps to control the exploration of the algorithm. The larger the value of sigma the further the algorithm explores the parameter space and vice versa. Thus they need to be adjusted for the algorithm to converge to the optimal solution.
ngen - the number of generations to run the algorithm
popsize - the population size
mu - the number of parents to select for the next generation
n_default - the number parameters to fit
restarts - the number of times to restart the algorithm
tolhistfun - the tolerance for the history of the function value
ftarget - the target fitness value
restart_from_best - whether to restart from the best solution found so far
verbose - whether to print the progress of the algorithm
dir_save - the directory to save the results of the fitting
[7]:
bestfit, fitness = Fitter1.cmaes(sigma=1, ngen=100, popsize=500, mu=10, n_default=11, restarts=1, tolhistfun=10E-5, ftarget=0, restart_from_best=True, verbose=False)
print(bestfit)
Iteration terminated due to ngen criterion after 100 gens
Doubled popsize
Iteration terminated due to ngen criterion after 100 gens
[]
height1 langle1 langle2 rangle1 rangle2 y_start bot_cd \
0 19.999831 1.397397 1.39493 1.395469 1.397299 0.15517 39.996793
dwx dwz i0 bkg_cste
0 0.099187 0.100135 9.999407 0.099898
/nobackup/nd276333/Workspace/cdsaxs/src/cdsaxs/fitter.py:290: RuntimeWarning: invalid value encountered in scalar divide
if(n_infs / fitness_arr.shape[0] > 0.5):
Now we have the bestfit parameters in the form of a pandas dataframe. We will look at the statistical information generated by the mcmc_bestfit_stats
method.
[8]:
with np.errstate(divide='ignore', invalid='ignore', over='ignore'):
stats = Fitter1.mcmc_bestfit_stats(N=11, sigma = 1., nsteps=100, nwalkers=500)
print(stats)
11 parameters
0%| | 0/100 [00:00<?, ?it/s]100%|██████████| 100/100 [00:02<00:00, 41.05it/s]
mean std count min max lower_ci \
height1 20.036754 0.244560 28695 18.724427 23.587721 20.033035
langle1 1.462781 0.235986 28695 0.056644 3.139290 1.459192
langle2 1.428477 0.225850 28695 0.060566 3.096835 1.425043
rangle1 1.442650 0.239630 28695 0.011439 3.079889 1.439006
rangle2 1.422168 0.236631 28695 0.036405 2.946014 1.418570
y_start 0.606899 0.224636 28695 0.001792 3.918192 0.603483
bot_cd 39.989187 0.037260 28695 39.633367 40.461651 39.988620
dwx 0.099176 0.000036 28695 0.098975 0.099463 0.099175
dwz 0.100136 0.000026 28695 0.099906 0.100385 0.100136
i0 9.999419 0.000030 28695 9.999232 9.999656 9.999419
bkg_cste 0.099896 0.000025 28695 0.099704 0.100075 0.099895
upper_ci uncertainity
height1 20.040473 3.719023e-03
langle1 1.466369 3.588627e-03
langle2 1.431912 3.434493e-03
rangle1 1.446294 3.644053e-03
rangle2 1.425767 3.598446e-03
y_start 0.610315 3.416029e-03
bot_cd 39.989753 5.666087e-04
dwx 0.099176 5.474902e-07
dwz 0.100137 3.993230e-07
i0 9.999420 4.615437e-07
bkg_cste 0.099896 3.858098e-07