Python Package pyperspec¶

This Python package is designed to simplify the analysis and manipulation of hyperspectral datasets. It offers an object-oriented, user-friendly interface that feels familiar to Python users, i.e. close to libraries such as numpy, pandas, and scikit-learn.

The package is heavily inspired by the R package hyperSpec and is part of r-hyperspec. The goal is to make working with hyperspectral data sets (i.e. spatially or time-resolved spectra, or spectra with any other kind of information associated with each spectrum) more comfortable. The spectra can come from XRF, UV/VIS, fluorescence, AES, NIR, IR, Raman, NMR, MS, or other spectroscopy measurements.
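
To make this data model concrete, here is a minimal sketch in plain NumPy and pandas of what such a data set consists of: a matrix with one row per spectrum, a shared spectral axis, and a metadata table with one row per spectrum. The variable names (`spectra`, `wavelengths`, `metadata`) and the synthetic values are illustrative assumptions, not the package's internal representation.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

n_spectra, n_points = 6, 50
wavelengths = np.linspace(400, 1800, n_points)   # shared spectral axis
spectra = rng.random((n_spectra, n_points))      # one row per spectrum
metadata = pd.DataFrame({                        # one row per spectrum
    "x": [0, 1, 2, 0, 1, 2],                     # hypothetical spatial position
    "y": [0, 0, 0, 1, 1, 1],
    "sample": ["a"] * 3 + ["b"] * 3,
})

# The three pieces only make sense together if their sizes agree.
assert spectra.shape == (len(metadata), len(wavelengths))

# Any subsetting has to be applied to all three pieces consistently.
wl_mask = (wavelengths >= 600) & (wavelengths <= 1200)
row_mask = (metadata["sample"] == "a").to_numpy()

sub_spectra = spectra[np.ix_(row_mask, wl_mask)]
sub_wavelengths = wavelengths[wl_mask]
sub_metadata = metadata.loc[row_mask].reset_index(drop=True)

print(sub_spectra.shape, sub_wavelengths.shape, sub_metadata.shape)
```

Keeping these three objects aligned by hand is the kind of bookkeeping that a single container object is meant to take over.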

NOTE: The main focus is not on algorithms, since many good packages already implement them, e.g. numpy, scipy, and pybaselines; more can be found in the FOSS For Spectroscopy list.

Rather, it provides a convenient interface for those algorithms and other routine tasks.
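
As an illustration of the kind of routine task meant here, the sketch below smooths and area-normalizes a small matrix of spectra using NumPy and SciPy directly. It is not the package's implementation; the synthetic data and variable names are assumptions made for the example, and `scipy.signal.savgol_filter` is used simply as one existing smoothing algorithm.

```python
import numpy as np
from scipy.signal import savgol_filter

rng = np.random.default_rng(1)

wl = np.linspace(1000, 2000, 101)
# Synthetic spectra: one Gaussian band plus noise (illustrative data only)
band = np.exp(-0.5 * ((wl - 1500) / 40) ** 2)
spectra = band + 0.05 * rng.standard_normal((4, wl.size))

# Savitzky-Golay smoothing of every spectrum along the wavelength axis
smoothed = savgol_filter(spectra, window_length=7, polyorder=2, axis=1)

# Area normalization: integrate each spectrum with the trapezoidal rule,
# then divide each row by its own area
dx = np.diff(wl)
area = np.sum(0.5 * (smoothed[:, 1:] + smoothed[:, :-1]) * dx, axis=1)
normalized = smoothed / area[:, None]

print(normalized.shape)
```

Each step is short on its own, but the axis arguments and broadcasting details have to be repeated for every data set, which is where a common interface helps.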

For detailed information and documentation, please visit Documentation.

Documentation¶

Please check here.

Installation¶

Currently available only from GitHub:

pip install git+https://github.com/r-hyperspec/pyperspec.git

Quick Demo¶

import pyspc
import numpy as np
import pandas as pd

spc = np.random.rand(10, 20)  # Here are your spectra in unfolded structure
wl = np.linspace(1000, 2000, 20)  # Array of wavelength/wavenumbers
# Additional meta-data
meta_data = pd.DataFrame({
    "group": ["A", "B"] * 5,
    "date": pd.date_range("2023-01-01", periods=10),
})

# Create the object
sf = pyspc.SpectraFrame(spc, wl=wl, data=meta_data)

# Easy meta-data manipulation
sf.group  # Attribute-style access to the "group" column
sf["group"]  # Item-style access to the same column
sf["group"] = ["Control", "Treatment"] * 5

# Easy data slicing/filtering, similar to hyperSpec
sf[:,:,1200:1500] # Cut wavelength range to [1200, 1500]
sf[:5,:,:5, True] # Use iloc style to get only first five spectra and first five wavenumbers
sf.query("group == 'Control'") # Get only 'Control' group

# Simple aggregation even with custom methods
sf[:,:,1200:1500].mean(groupby=["group", "date"])
sf.query("group == 'Control'").apply(lambda x: np.sum(x**2), axis=0)

# Chaining methods
sf_processed = (
    sf.query("group == 'Control'")
    .mean(groupby="date")
    .smooth("savgol", window_length=7, polyorder=2)
    .sbaseline("rubberband")
    .normalize("area")
)

# Select 3 random spectra and plot them colored by "date"
sf.sample(3).plot(colors="date")

# Export to wide pandas DataFrame
sf.to_pandas()
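
For readers who want to see what a group-wise mean and a wide export amount to conceptually, here is a rough equivalent in plain pandas. This is a sketch under assumptions: it does not use the package, and the actual column layout returned by `to_pandas()` may differ from the `wide` table built here.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)

spc = rng.random((10, 20))                    # 10 spectra, 20 points each
wl = np.linspace(1000, 2000, 20)              # spectral axis
meta = pd.DataFrame({"group": ["A", "B"] * 5})

# Wide table: metadata columns followed by one column per wavelength
wide = pd.concat([meta, pd.DataFrame(spc, columns=wl)], axis=1)

# Group-wise mean spectrum: one row per group, one column per wavelength
group_means = wide.groupby("group").mean()

print(wide.shape)
print(group_means.shape)
```

The wide layout is convenient for handing the data to tools that expect an ordinary DataFrame, at the cost of mixing metadata and spectral columns in one table.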

Acknowledgments¶

This project was a continuation of ibcp/pyspectra. We acknowledge the support and contribution of the Emanuel Institute of Biochemical Physics, RAS.
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie Actions (Grant Agreement 861122) as part of the IMAGE-IN project.
The project was developed as part of a secondment at Chemometrix GmbH.
Supervision was provided by Chemometrix GmbH, Leibniz-IPHT, and BMD Software.
