public class EnergyAnalyser extends FrameBasedAnalyser<Double>
FrameBasedAnalyser.FrameAnalysisResult<T>
Modifier and Type | Field and Description |
---|---|
protected int | DEFAULT_MAXSIZE |
protected double[] | frameEnergies: Array of frame energies, for further analysis. |
protected int | len: Length of valid data, counting from offset. |
protected int | maxSize: Maximum size of the double[] storing the frame energies. |
protected int | offset: Beginning of valid data in frameEnergies; will be >0 only after more than maxSize frames have been read. |
analysisResults
frame, frameLength, frameShift, frameStart, nextFrameStart, processor, samplingRate, signal, totalRead, validSamplesInFrame
Constructor and Description |
---|
EnergyAnalyser(DoubleDataSource signal, int framelength, int samplingRate) |
EnergyAnalyser(DoubleDataSource signal, int framelength, int frameShift, int samplingRate) |
EnergyAnalyser(DoubleDataSource signal, int framelength, int frameShift, int samplingRate, int maxSize) |
Modifier and Type | Method and Description |
---|---|
Double | analyse(double[] frame): Apply this FrameBasedAnalyser to the given data. |
static void | energySegmentation(String[] args): Segment a WAVE file by energy, ideally one word per segment (the result might contain more); the result is saved in a file in transcriber format so the segmentation can be easily inspected and corrected. |
double[] | getEnergyHistogram(): Compute a histogram of energies found in the data. |
double[] | getEnergyHistogram(int nbins): Compute a histogram of energies found in the data. |
double | getMaxFrameEnergy(): Compute the overall maximum energy in all frames. |
double | getMeanFrameEnergy(): Compute the overall mean energy in all frames. |
double | getMinFrameEnergy(): Compute the overall minimum energy in all frames. |
double | getSilenceCutoff(): Determine the energy level below which to find silence. |
double | getSilenceCutoffFromKMeansClustering(double shiftFromMinimumEnergyCenter, int numClusters) |
double | getSilenceCutoffFromSortedEnergies(FrameBasedAnalyser.FrameAnalysisResult[] far, double silenceThreshold) |
double[][] | getSpeechStretches(): For the current audio data and the automatically calculated silence cutoff, compute a list of start and end times representing speech stretches within the file. |
double[][] | getSpeechStretchesUsingEnergyHistory(int energyBufferLength, double speechStartLikelihood, double speechEndLikelihood, double shiftFromMinimumEnergyCenter, int numClusters): The latest version uses K-Means clustering to cluster energy values into 3 separate clusters. |
static void | main(String[] args) |
protected void | rememberFrameEnergy(double energy) |
analyseAllFrames, analyseAvailableFrames, analyseNextFrame, constructAnalysisResult
getCurrentFrame, getData, getFrameLengthSamples, getFrameLengthTime, getFrameShiftSamples, getFrameShiftTime, getFrameStartSamples, getFrameStartTime, getNextFrame, getSamplingRate, hasMoreData, resetInternalTimer, stopWhenTouchingEnd, validSamplesInFrame
protected final int DEFAULT_MAXSIZE
protected double[] frameEnergies
protected int offset
protected int len
protected int maxSize
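The three fields offset, len and maxSize describe a bounded history of frame energies. The sketch below shows one way such a history could behave, assuming a circular buffer that overwrites the oldest entry once maxSize frames have been stored. The class EnergyHistory and its methods are hypothetical and are not part of EnergyAnalyser.

```java
/** Hypothetical stand-in for the bounded energy history; not the library's code. */
final class EnergyHistory {
    private final double[] frameEnergies; // storage for at most maxSize energies
    private final int maxSize;
    private int offset = 0; // index of the oldest valid entry
    private int len = 0;    // number of valid entries, counting from offset

    EnergyHistory(int maxSize) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.maxSize = maxSize;
        this.frameEnergies = new double[maxSize];
    }

    /** Store one energy value, overwriting the oldest one when the buffer is full. */
    void remember(double energy) {
        if (len < maxSize) {
            // Assumption: while not full, append behind the valid data.
            frameEnergies[(offset + len) % maxSize] = energy;
            len++;
        } else {
            // Assumption: when full, replace the oldest entry and advance offset.
            frameEnergies[offset] = energy;
            offset = (offset + 1) % maxSize;
        }
    }

    /** Copy the valid energies out in chronological order. */
    double[] toArray() {
        double[] out = new double[len];
        for (int i = 0; i < len; i++) {
            out[i] = frameEnergies[(offset + i) % maxSize];
        }
        return out;
    }

    int offset() { return offset; }

    int length() { return len; }
}
```

Under this assumption, offset stays at 0 until the buffer has wrapped, which matches the field description of offset.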
public EnergyAnalyser(DoubleDataSource signal, int framelength, int samplingRate)
public EnergyAnalyser(DoubleDataSource signal, int framelength, int frameShift, int samplingRate)
public EnergyAnalyser(DoubleDataSource signal, int framelength, int frameShift, int samplingRate, int maxSize)
public Double analyse(double[] frame)
analyse in class FrameBasedAnalyser<Double>
frame - the data to analyse, which must be of the length prescribed by this FrameBasedAnalyser, i.e. the value returned by FrameProvider.getFrameLengthSamples()
IllegalArgumentException - if frame does not have the prescribed length
protected void rememberFrameEnergy(double energy)
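The documented contract of analyse (a frame of the prescribed length goes in, a Double comes out, and a wrong length raises IllegalArgumentException) can be illustrated with a small sketch. It assumes that the energy of a frame is the sum of its squared samples, which this page does not state. The class FrameEnergySketch is hypothetical.

```java
/** Hypothetical illustration of the analyse contract; not the library's code. */
final class FrameEnergySketch {
    private final int frameLengthSamples;

    FrameEnergySketch(int frameLengthSamples) {
        this.frameLengthSamples = frameLengthSamples;
    }

    /** Return the frame energy, rejecting frames of the wrong length. */
    Double analyse(double[] frame) {
        if (frame.length != frameLengthSamples) {
            throw new IllegalArgumentException(
                "Expected frame of length " + frameLengthSamples + ", got " + frame.length);
        }
        // Assumption: energy is the plain sum of squared sample values.
        double energy = 0.0;
        for (double sample : frame) {
            energy += sample * sample;
        }
        return energy;
    }
}
```

In the real class, the computed value would presumably also be passed to rememberFrameEnergy so that the statistics methods can use it later.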
public double getMeanFrameEnergy()
public double getMaxFrameEnergy()
public double getMinFrameEnergy()
public double[] getEnergyHistogram()
public double[] getEnergyHistogram(int nbins)
nbins - the number of bins to compute, e.g. 100
public double getSilenceCutoff()
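getEnergyHistogram(int nbins) is described only as computing a histogram of the energies found in the data. As a rough illustration, the sketch below bins a set of energies into nbins equal-width bins between the minimum and maximum energy and returns raw counts. Both the bin layout and the use of raw counts are assumptions, and the class EnergyHistogramSketch is hypothetical.

```java
/** Hypothetical histogram helper; the binning scheme is an assumption. */
final class EnergyHistogramSketch {
    /** Count how many energies fall into each of nbins equal-width bins. */
    static double[] histogram(double[] energies, int nbins) {
        if (nbins <= 0) {
            throw new IllegalArgumentException("nbins must be positive");
        }
        double[] hist = new double[nbins];
        if (energies.length == 0) {
            return hist; // nothing to count
        }
        double min = energies[0];
        double max = energies[0];
        for (double e : energies) {
            min = Math.min(min, e);
            max = Math.max(max, e);
        }
        double width = (max - min) / nbins;
        for (double e : energies) {
            // If all energies are equal, width is 0 and everything lands in bin 0.
            int bin = width > 0 ? (int) ((e - min) / width) : 0;
            if (bin >= nbins) {
                bin = nbins - 1; // the maximum value belongs to the last bin
            }
            hist[bin]++;
        }
        return hist;
    }
}
```

A cutoff search such as getSilenceCutoff() could then look for structure in these counts, for example a low-energy peak, though this page does not say how the cutoff is actually derived.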
public double getSilenceCutoffFromSortedEnergies(FrameBasedAnalyser.FrameAnalysisResult[] far, double silenceThreshold)
public double[][] getSpeechStretches()
signalproc.minsilenceduration (default: 0.1 seconds)
signalproc.minspeechduration (default: 0.1 seconds)
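To make the role of the two duration settings concrete, here is a hedged sketch of how per-frame energies, a silence cutoff and minimum silence and speech durations could be turned into start and end times. The algorithm (threshold, merge across short silences, discard short stretches) is an assumption, not the documented behaviour of getSpeechStretches(). The class SpeechStretchSketch and its parameter names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical segmentation helper; the algorithm is an assumption. */
final class SpeechStretchSketch {
    /** Return {start, end} pairs in seconds for stretches whose energy exceeds cutoff. */
    static double[][] speechStretches(double[] energies, double cutoff,
            double frameShiftSeconds, double minSilence, double minSpeech) {
        // 1. Collect raw runs of consecutive frames above the cutoff.
        List<double[]> runs = new ArrayList<>();
        int runStart = -1;
        for (int i = 0; i <= energies.length; i++) {
            boolean speech = i < energies.length && energies[i] > cutoff;
            if (speech && runStart < 0) {
                runStart = i;
            } else if (!speech && runStart >= 0) {
                runs.add(new double[] {runStart * frameShiftSeconds, i * frameShiftSeconds});
                runStart = -1;
            }
        }
        // 2. Merge runs separated by a silence shorter than minSilence.
        List<double[]> merged = new ArrayList<>();
        for (double[] run : runs) {
            if (!merged.isEmpty() && run[0] - merged.get(merged.size() - 1)[1] < minSilence) {
                merged.get(merged.size() - 1)[1] = run[1];
            } else {
                merged.add(run);
            }
        }
        // 3. Drop stretches shorter than minSpeech.
        merged.removeIf(s -> s[1] - s[0] < minSpeech);
        return merged.toArray(new double[0][]);
    }
}
```

If the two settings are read as JVM system properties (an assumption based on their names), a caller could obtain them with `Double.parseDouble(System.getProperty("signalproc.minsilenceduration", "0.1"))` and likewise for signalproc.minspeechduration.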
public double getSilenceCutoffFromKMeansClustering(double shiftFromMinimumEnergyCenter, int numClusters)
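This page gives no description for getSilenceCutoffFromKMeansClustering, so the following is only a sketch of the general idea suggested by its name: cluster the frame energies with one-dimensional k-means and place the cutoff relative to the lowest cluster centre. The initialisation, the iteration limit and above all the interpretation of shiftFromMinimumEnergyCenter as a fraction of the distance between the lowest and highest centres are assumptions. The class KMeansCutoffSketch is hypothetical.

```java
import java.util.Arrays;

/** Hypothetical 1-D k-means cutoff; not the library's implementation. */
final class KMeansCutoffSketch {
    /** Run Lloyd's algorithm on scalar energies and return the centres in ascending order. */
    static double[] clusterCentres(double[] energies, int numClusters, int maxIterations) {
        if (energies.length == 0 || numClusters < 1) {
            throw new IllegalArgumentException("need data and at least one cluster");
        }
        double min = energies[0];
        double max = energies[0];
        for (double e : energies) {
            min = Math.min(min, e);
            max = Math.max(max, e);
        }
        // Assumption: spread the initial centres evenly over [min, max].
        double[] centres = new double[numClusters];
        for (int k = 0; k < numClusters; k++) {
            centres[k] = numClusters == 1
                ? (min + max) / 2
                : min + k * (max - min) / (numClusters - 1);
        }
        for (int iter = 0; iter < maxIterations; iter++) {
            double[] sum = new double[numClusters];
            int[] count = new int[numClusters];
            for (double e : energies) {
                int best = 0; // index of the nearest centre
                for (int k = 1; k < numClusters; k++) {
                    if (Math.abs(e - centres[k]) < Math.abs(e - centres[best])) {
                        best = k;
                    }
                }
                sum[best] += e;
                count[best]++;
            }
            boolean changed = false;
            for (int k = 0; k < numClusters; k++) {
                if (count[k] == 0) {
                    continue; // an empty cluster keeps its previous centre
                }
                double updated = sum[k] / count[k];
                if (updated != centres[k]) {
                    centres[k] = updated;
                    changed = true;
                }
            }
            if (!changed) {
                break; // converged
            }
        }
        Arrays.sort(centres);
        return centres;
    }

    /** Assumption: cutoff = lowest centre + shift * (highest centre - lowest centre). */
    static double silenceCutoff(double[] energies, double shift, int numClusters) {
        double[] c = clusterCentres(energies, numClusters, 100);
        return c[0] + shift * (c[c.length - 1] - c[0]);
    }
}
```

With three clusters, the lowest centre would plausibly correspond to silence and the highest to speech, but the meaning the library assigns to each cluster is not stated here.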
public double[][] getSpeechStretchesUsingEnergyHistory(int energyBufferLength, double speechStartLikelihood, double speechEndLikelihood, double shiftFromMinimumEnergyCenter, int numClusters)
energyBufferLength - energyBufferLength
speechStartLikelihood - speechStartLikelihood
speechEndLikelihood - speechEndLikelihood
shiftFromMinimumEnergyCenter - shiftFromMinimumEnergyCenter
numClusters - numClusters
public static void energySegmentation(String[] args) throws Exception
args - the first argument is the directory containing the WAV files; the remaining arguments are the files to segment.
Exception - an IOException, an UnsupportedAudioFileException, or an IllegalArgumentException if the file is not mono (only mono audio signals are handled).
Copyright © 2000–2016 DFKI GmbH. All rights reserved.