Parsing
These functions are used within the various data file and experiment classes, and you are unlikely to need them while performing data analysis with magnetopy. They may be useful, though, when developing new classes for new types of data files or new experiments.
label_clusters(vals, eps=0.001, min_samples=10)
For determining the nominal values of data in a series containing one or more nominal values with some fluctuations. The data is first normalized using sklearn.preprocessing.StandardScaler(), then clustered using sklearn.cluster.DBSCAN().
It is assumed that all data belongs to a cluster (i.e. there are no outliers). If this is not the case, eps is increased by a factor of 10 and the clustering is tried again.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
vals | pd.Series | A series of data containing one or more nominal values with some fluctuations. | required |
eps | float | Passed to | 0.001 |
min_samples | int | Passed to | 10 |
Returns:
Type | Description |
---|---|
np.ndarray | An array of the same size as |
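The normalize, cluster, and retry steps described for label_clusters can be sketched without scikit-learn. This is a minimal, dependency-free illustration and not magnetopy's implementation: standardize, dbscan_1d, label_with_retry, and the max_tries guard are hypothetical names, and the hand-written one-dimensional DBSCAN stands in for sklearn.cluster.DBSCAN().

```python
from statistics import fmean, pstdev


def standardize(values):
    """Scale to zero mean and unit variance (population std, as StandardScaler does)."""
    mean, std = fmean(values), pstdev(values)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def dbscan_1d(points, eps, min_samples):
    """Minimal 1-D DBSCAN stand-in: one label per point, -1 marks noise."""
    n = len(points)
    # A point's neighborhood includes the point itself.
    neighbors = [
        [j for j in range(n) if abs(points[i] - points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_samples for nb in neighbors]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        # Grow a new cluster outward from this unlabeled core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    if is_core[q]:
                        stack.append(q)
        cluster += 1
    return labels


def label_with_retry(values, eps=0.001, min_samples=10, max_tries=8):
    """Cluster, and widen eps tenfold while any point is still labeled as noise."""
    scaled = standardize(values)
    for _ in range(max_tries):  # max_tries is a hypothetical safety guard
        labels = dbscan_1d(scaled, eps, min_samples)
        if -1 not in labels:
            return labels
        eps *= 10
    raise ValueError("could not assign every point to a cluster")


# Two nominal values (100 and 1000), each with fluctuations of +/- 1.
low = [100 + ((i % 3) - 1) for i in range(12)]
high = [1000 + ((i % 3) - 1) for i in range(12)]
labels = label_with_retry(low + high)
```

With this data the first pass at eps=0.001 leaves every point as noise, so the sketch retries at eps=0.01, where each group of 12 points forms its own cluster.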
Source code in magnetopy\parsing_utils.py
unique_values(x, eps=0.001, min_samples=10, ndigits=0)
Given a series of data containing one or more nominal values with some noise, returns a list of the nominal values.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | pd.Series | A series of data containing one or more nominal values with some noise. | required |
eps | float | Passed to | 0.001 |
min_samples | int | Passed to | 10 |
ndigits | int | The number of digits after the decimal point to round the nominal values to, by default 0. | 0 |
Returns:
Type | Description |
---|---|
list[float] | The nominal values in |
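Once every point carries a cluster label, reducing the series to its nominal values is a per-cluster summary followed by rounding. The sketch below assumes the summary is the cluster mean, which the text above does not state; nominal_values and the sample data are hypothetical, and the labels are supplied by hand rather than computed.

```python
from statistics import fmean


def nominal_values(values, labels, ndigits=0):
    """Average the members of each cluster and round the result (assumed summary)."""
    clusters = {}
    for value, label in zip(values, labels):
        clusters.setdefault(label, []).append(value)
    # Report clusters in order of their label.
    return [round(fmean(members), ndigits) for _, members in sorted(clusters.items())]


fields = [100.02, 99.98, 100.01, 299.9, 300.1, 300.0]
labels = [0, 0, 0, 1, 1, 1]
nominal = nominal_values(fields, labels)
```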
Source code in magnetopy\parsing_utils.py
find_outlier_indices(x, threshold=3)
Finds the indices of outliers in a series of data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | pd.Series | A series of data. | required |
threshold | float | The number of standard deviations from the mean to consider a value an outlier, by default 3. | 3 |
Returns:
Type | Description |
---|---|
list[int] | The indices of the outliers in |
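The threshold test described for find_outlier_indices amounts to a z-score comparison. The following is a minimal sketch using only the standard library; outlier_indices is a hypothetical name, and the use of the population standard deviation is an assumption, since the text does not say which estimator is used.

```python
from statistics import fmean, pstdev


def outlier_indices(values, threshold=3):
    """Indices of values more than `threshold` standard deviations from the mean."""
    mean, std = fmean(values), pstdev(values)  # population std is an assumption
    if std == 0:
        return []  # constant data has no outliers
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]


readings = [1.0] * 20
readings.insert(7, 100.0)  # a single spike at index 7
outliers = outlier_indices(readings)
```

A single spike among n otherwise identical points has a z-score of about the square root of n - 1, so short series can never exceed a threshold of 3 no matter how large the spike is.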
Source code in magnetopy\parsing_utils.py
find_temp_turnaround_point(df, num_endpoints_ignored=20)
Finds the index of the temperature turnaround point in a dataframe of a ZFCFC experiment which includes a column "Temperature (K)". Can handle two cases in which a single dataframe contains first a ZFC experiment, then a FC experiment:
- Case 1: ZFC temperature monotonically increases, then FC temperature monotonically decreases.
- Case 2: ZFC temperature monotonically increases, the temperature is reset to a lower value, then FC temperature monotonically increases.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
df | pd.DataFrame | A dataframe of a ZFCFC experiment which includes a column "Temperature (K)". | required |
num_endpoints_ignored | int | The number of endpoints to ignore when finding the turnaround point, by default 20. This is useful when dealing with data collected by scanning temperature, as there are often 20 or so points at the end of the scan where the temperature is very slowly settling. | 20 |
Returns:
Type | Description |
---|---|
int | The index of the temperature turnaround point. |
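Both cases share one feature: somewhere away from the endpoints the temperature drops for the first time, either gradually (Case 1) or in a single reset (Case 2). The sketch below uses that observation on a plain list of noise-free temperatures. It is an illustration under assumptions, not the library's method: turnaround_index is a hypothetical name, and treating the first falling point as the turnaround index is a choice made here.

```python
def turnaround_index(temps, num_endpoints_ignored=20):
    """Index of the first point, away from the endpoints, that is lower than its predecessor."""
    start = num_endpoints_ignored + 1
    stop = len(temps) - num_endpoints_ignored
    for i in range(start, stop):
        if temps[i] < temps[i - 1]:
            return i
    return None  # no drop found in the interior of the scan


zfc = [float(t) for t in range(5, 105)]          # 5 K to 104 K, rising
fc_down = [float(t) for t in range(103, 4, -1)]  # Case 1: 103 K to 5 K, falling
fc_up = list(zfc)                                # Case 2: reset to 5 K, rising again

case_1 = turnaround_index(zfc + fc_down)
case_2 = turnaround_index(zfc + fc_up)
```

Real scans are noisy, so a strict comparison between neighboring points would need smoothing or a tolerance before it could be applied to measured data.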
Source code in magnetopy\parsing_utils.py
find_sequence_starts(x, flucuation_tolerance=0)
Finds the indices of the start of each sequence in a series of data, where a sequence is defined as a series of numbers that constantly increase or decrease. Changes below flucuation_tolerance are ignored.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
x | pd.Series | A series of data. | required |
flucuation_tolerance | float | Changes below this value are ignored, by default 0. | 0 |
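The idea of splitting a series wherever its direction reverses can be sketched in plain Python. This is an illustration under stated assumptions rather than the library's code: sequence_starts is a hypothetical name, a new sequence is taken to start at the first point whose step reverses the current direction, and steps of zero or smaller than the tolerance are skipped.

```python
def sequence_starts(values, tolerance=0.0):
    """Start indices of monotonic runs, skipping steps smaller than `tolerance`."""
    starts = [0] if values else []
    direction = 0  # +1 rising, -1 falling, 0 not yet known
    for i in range(1, len(values)):
        step = values[i] - values[i - 1]
        if step == 0 or abs(step) < tolerance:
            continue  # too small to count as a change
        new_direction = 1 if step > 0 else -1
        if direction != 0 and new_direction != direction:
            starts.append(i)  # this step reverses the current run
        direction = new_direction
    return starts


clean = [0, 1, 2, 3, 4, 3, 2, 1, 0, 1, 2, 3, 4]
noisy = [0, 1, 2, 1.95, 3, 4, 3, 2]

clean_starts = sequence_starts(clean)
noisy_strict = sequence_starts(noisy)
noisy_tolerant = sequence_starts(noisy, tolerance=0.1)
```

In the noisy series the dip of 0.05 at index 3 produces two spurious starts when the tolerance is 0, and none when the tolerance is 0.1.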
Examples: