Python Package pyperspec
This is a Python package designed to simplify the analysis and manipulation of hyperspectral datasets.
The package takes an object-oriented approach and provides a user-friendly interface that feels familiar to Python users, i.e. close to libraries such as numpy, pandas, and scikit-learn.
It is heavily inspired by the R package hyperSpec and is part of r-hyperspec.
The goal is to make working with hyperspectral data sets (i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each spectrum) more comfortable.
The spectra can be data obtained during
XRF,
UV/VIS,
Fluorescence,
AES,
NIR,
IR,
Raman,
NMR,
MS,
etc. spectroscopy measurements.
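To make the idea of "spectra plus information associated with each spectrum" concrete, here is a minimal sketch in plain NumPy, assuming a small map-type dataset. The names `cube`, `spectra`, and `meta` are illustrative only and are not part of the package.

```python
import numpy as np

# Hypothetical hyperspectral map: 4 x 3 pixels, 5 wavelength points per pixel.
n_rows, n_cols, n_wl = 4, 3, 5
cube = np.arange(n_rows * n_cols * n_wl, dtype=float).reshape(n_rows, n_cols, n_wl)

# Unfold the cube: one row per spectrum, one column per wavelength.
spectra = cube.reshape(-1, n_wl)

# Per-spectrum information that must stay aligned with the rows of `spectra`.
y_idx, x_idx = np.indices((n_rows, n_cols))
meta = {"y": y_idx.ravel(), "x": x_idx.ravel()}

# Row 7 of the unfolded matrix is the spectrum measured at pixel (y=2, x=1).
i = 7
print(spectra.shape, meta["y"][i], meta["x"][i])
```

Every filtering or sorting step has to be applied to both the spectra matrix and the associated information, which is the kind of bookkeeping a dedicated container takes over.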
NOTE: The main focus is not on algorithms, since many good packages already implement them (e.g. numpy, scipy, pybaselines; more can be found in the FOSS For Spectroscopy list).
Rather, the package provides a convenient interface to those algorithms and to other routine tasks.
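As an illustration of such a routine task, the sketch below normalizes each spectrum to unit area using plain NumPy. `normalize_area` is a hypothetical helper written for this example; it is not the package's API and makes no claim about how the package implements normalization.

```python
import numpy as np

def normalize_area(spectra, wl):
    """Scale each row of `spectra` so its trapezoidal area over `wl` is 1."""
    spectra = np.asarray(spectra, dtype=float)
    wl = np.asarray(wl, dtype=float)
    # Trapezoidal rule written out explicitly: mean of neighbours times spacing.
    widths = np.diff(wl)
    heights = 0.5 * (spectra[:, 1:] + spectra[:, :-1])
    area = (heights * widths).sum(axis=1, keepdims=True)
    return spectra / area

wl = np.linspace(1000, 2000, 20)
rng = np.random.default_rng(0)
spc = rng.random((10, 20)) + 0.1  # offset keeps every area strictly positive

normalized = normalize_area(spc, wl)
```

Even this small step needs care with axes and with the wavelength spacing, which is why wrapping such operations behind one consistent interface is useful.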
For detailed information and documentation, please visit PyPerSpec Documentation.
Documentation
Please see the PyPerSpec Documentation.
Installation
Currently available only from GitHub:
Quick Demo
import pyspc
import numpy as np
import pandas as pd
spc = np.random.rand(10, 20) # Your spectra in unfolded form: one row per spectrum
wl = np.linspace(1000, 2000, 20) # Array of wavelengths/wavenumbers
meta_data = pd.DataFrame({"group": ..., "date": ...,}) # Additional meta-data, one row per spectrum
# Create the object
sf = pyspc.SpectraFrame(spc, wl=wl, data=meta_data)
# Easy meta-data manipulation
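sf.group # Attribute-style access to a meta-data column
sf["group"] # The same column by name
sf["E"] = ... # Add or overwrite a meta-data column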
# Easy data slicing/filtering, similar to hyperSpec
sf[:,:,500:1000] # Cut wavelength range to [500, 1000]
sf[:5,:,:5, True] # Use iloc-style indexing to get only the first five spectra and first five wavenumbers
sf.query("group == 'Control'") # Get only 'Control' group
# Simple aggregation even with custom methods
sf[:,:,500:1000].mean(groupby=["group", "date"])
sf.query("group == 'Control'").apply(lambda x: np.sum(x**2), axis=0)
# Chaining methods
sf_processed = (
    sf.query("group == 'Control'")
    .mean(groupby="date")
    .smooth("savgol", window_length=7, polyorder=2)
    .sbaseline("rubberband")
    .normalize("area")
)
# Select 3 random spectra and plot them colored by "date"
sf.sample(3).plot(colors="date")
# Export to wide pandas DataFrame
sf.to_pandas()
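For readers who think in pandas terms, the sketch below shows what filtering by meta-data and averaging per group amount to on a wide table (meta-data columns followed by one column per wavelength). It uses only numpy and pandas with made-up data, and it illustrates the concept rather than how the package implements `query` or `mean`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
wl = np.linspace(1000, 2000, 20)
spc = rng.random((6, 20))
meta = pd.DataFrame({
    "group": ["Control", "Control", "Control", "Treated", "Treated", "Treated"],
    "date": ["d1", "d1", "d2", "d1", "d2", "d2"],
})

# Wide table: meta-data columns followed by one column per wavelength.
wide = pd.concat([meta, pd.DataFrame(spc, columns=wl)], axis=1)

# Keep only the 'Control' spectra, then average the spectra within each date.
control = wide[wide["group"] == "Control"]
mean_by_date = control.drop(columns="group").groupby("date").mean()
```

Here `mean_by_date` has one row per date and one column per wavelength; the group labels move into the index, so they have to be carried along by hand in later steps.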
Acknowledgments
- This project is a continuation of ibcp/pyspectra. We acknowledge the support and contribution of the Emanuel Institute of Biochemical Physics, RAS.
- This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Actions (Grant Agreement 861122) as part of the IMAGE-IN project.
- The project was developed as part of a secondment at Chemometrix GmbH.
- Supervision was provided by Chemometrix GmbH, Leibniz-IPHT, and BMD Software.