Uploaded image for project: 'Spark'
  1. Spark
  2. SPARK-17716

Hidden Markov Model (HMM)

    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Resolved
    • Major
    • Resolution: Incomplete
    • None
    • None
    • MLlib

    Description

      Had an offline chat with Lil'Rex, who implemented HMM on Spark at https://github.com/apache/spark/compare/master...lilrex:sequence. I asked him to list popular HMM applications, describe public API (params, input/output schemas), compare its API with existing HMM implementations.

      Hidden Markov Model (HMM) Design Doc

      Overview

      Introduction to HMM

      Hidden Markov Model is a type of statistical Machine Learning model that assumes a sequence of observations is generated by a Markov process with hidden states. There are 3 (or 2, depending on the implementation) main components of the model:

      • Transition Probability: describes the probability distribution of transitions from each state to other states (including self) in the Markov process
      • Emission Probability: describes the probability distribution for an observation associated with hidden states
      • Initial/Start Probability (optional): represents the prior probability of each state at the beginning of the observation sequence

      Note: some implementations merge the Initial Probability into Transition Probability by adding an arbitrary Start state before the first observation point.

      HMM Models and Algorithms

      Given a limited number of states, most HMM models have the same form of Transition Probability: a matrix, where each element (i, j) represents the probability of transition from state i to state j. The Initial Probability usually take the simple form of a probabilistic vector.

      The Emission Probability, on the other hand, can be represented in many different ways, depending on different nature of observations, i.e. continuous vs. discrete, or different model assumptions, e.g. single Gaussian vs. Gaussian Mixtures.

      There are three main problems associated with HMM models, and their canonical algorithms:

      1. Evaluation: What’s the probability of a given observation sequence, based on the model? It’s usually done by either Forward or Backward algorithms
      2. Decoding: What’s the most likely state sequence, given the observation sequence and the model? It’s usually done by Viterbi decoding
      3. Learning: How to train the parameters of the model based on the observation sequences? Baum-Welch (Forward-Backward) is usually used as part of the EM algorithm in unsupervised training

      Popular Applications of HMM

      • Speech Recognition
      • Part-of-speech Tagging
      • Named Entity Recognition
      • Machine Translation
      • Gene Prediction

      Alternate Libraries

      Mahout

      • Assume emission probability to be an m-by-n matrix
      • Use Baum-Welch algorithm for training and Viterbi algorithm for prediction
      • API (command line)
        • training
          $ echo "0 1 2 2 2 1 1 0 0 3 3 3 2 1 2 1 1 1 1 2 2 2 0 0 0 0 0 0 2 2 2 0 0 0 0 0 0 2 2 2 3 3 3 3 3 3 2 3 2 3 2 3 2 1 3 0 0 0 1 0 1 0 2 1 2 1 2 1 2 3 3 3 3 2 2 3 2 1 1 0" > hmm-input
          $ export MAHOUT_LOCAL=true
          $ $MAHOUT_HOME/bin/mahout baumwelch -i hmm-input -o hmm-model -nh 3 -no 4 -e .0001 -m 1000
        • prediction
          $ $MAHOUT_HOME/bin/mahout hmmpredict -m hmm-model -o hmm-predictions -l 10
          $ cat hmm-predictions

      Mallet

      • Treat HMM as a Finite State Transducer (FST)
      • Theoretically can go beyond first-order Markov assumption by setting an arbitrary order
      • Limited to text data, i.e. discrete observation sequence with Multinomial emission model assumption
      • Supervised training only
      • API:
        • Training:
          HMM hmm = new HMM(pipe, null);
          hmm.addStatesForLabelsConnectedAsIn(trainingInstances);
          HMMTrainerByLikelihood trainer = new HMMTrainerByLikelihood(hmm);
          trainer.train(trainingInstances, 10);
        • Testing:
          evaluator.evaluate(trainer);

      HMMLearn

      • Previously part of scikit-learn
      • Algo:
        • Standard HMM unsupervised training algorithm
        • Three types of emission models: GMM, Gaussian and Multinomial
      • API:
        • Training:
          model = hmm.GaussianHMM(n_components=3, covariance_type="full")
          model.fit(X)
        • Testing:
          hidden_states = model.predict(X)

      API

      Design Goals

      • Build the foundation for general Sequential Tagging models (HMM, CRF, etc.)
      • Support multiple Emission Probability models such as “Multinomial” and “Gaussian Mixture”
      • Keep both supervised and unsupervised learning for HMM in mind

      Proposed API

      Note: This is written for the spark.ml API.

      Decoder API

      Decoder.scala
      trait DecoderParams extends Params {
        def featureCol: DataType // column of sequential features, e.g. MatrixUDT
        def predictionCol: DoubleType // column of prediction
        def labelCol: DataType // column of sequential labels, e.g. VectorUDT
      }
      
      abstract class Decoder extends Estimator with DecoderParams {
        def extractLabeledSequences(dataset: Dataset[_]): RDD[LabeledSequence]
      }
      
      abstract class DecodingModel extends Model with DecoderParams {
        def numFeatures: Int
        def decode(features: FeatureType): Vector
      }
      

      Tagger API

      Tagger.scala
      trait TaggerParams extends DecoderParams with HasRawPredictionCol {
        def rawPredictionCol: MatrixUDT // column for all predicted label sequences
      }
      
      abstract class Tagger extends Decoder with TaggerParams
      
      abstract class TaggingModel extends DecodingModel with TaggerParams {
        def numClasses: Int
        def decodeRaw(features: FeaturesType): Array[(Double, Vector)]
        def raw2prediction(rawPrediction: Array[(Double, Vector)]): Vector
      }
      

      MarginalTagger (Probabilistic Tagger) API

      MarginalTagger.scala
      trait MarginalTaggerParams extends TaggerParams with HasProbabilityCol with HasThreshold {
        def probabilityCol: DoubleType // column for probability
      }
      
      abstract class MarginalTagger extends Tagger with MarginalTaggerParams
      
      abstract class MarginalTaggingModel extends TaggingModel with MarginalTaggerParams {
        def getMargin(featuers: FeaturesType): Double
      }
      

      HiddenMarkovModel API

      HiddenMarkovModel.scala
      trait HiddenMarkovModelParams extends MarginalTaggerParams with HasMaxIter with HasTol with HasStandardization with HasThreshold {
        def smoothing: DoubleParam
        def emissionType: Param[String] //can be either Multinomial or Gaussian
      }
      
      class HiddenMarkovModel extends MarginalTagger with HiddenMarkovModelParams {
        def initialModel: Option[HMMModel] //initial model before training
        def def trainSupervised(dataset: Dataset[_]): Option[HMMModel]
        def trainUnsupervised(dataset: Dataset[_]): HMMModel
      }
      
      abstract class HMMModel extends MarginalTaggingModel with HiddenMarkovModelParams {
        def initialProb: Vector // initial probability for states
        def transitionProb: DenseMatrix // transition probability between states
      
        //compute feature scores for all states in all frames in a sequence, different for different emission models, e.g. Multinomial vs. GMM
        def precomputeEmission(features: Matrix): List[Array[Double]] 
      
        // accumulate sufficient statistics for emission model  
        def getEmissionStats(features: Matrix, gammas: List[Array[Double]]): DenseMatrix
      
        // forward algorithm
        def forward(emissions: Traversable[Array[Double]]): List[Array[Double]] 
      
        // backword algorithm
        def backward(emissions: Traversable[Array[Double]]):List[Array[Double]] 
      }
      
      class MultinomialHMMModel extends HMMModel {
        def emissionProb: Matrix // emission probability from states to observations
      }
      
      object MultinomialHMMModel extends MLReadable[MultinomialHMMModel]
      
      class HMMModelReader extends MLReader[MultinomialHMMModel]
      

      Attachments

        Activity

          People

            Lil'Rex Runxin Li
            mengxr Xiangrui Meng
            Votes:
            1 Vote for this issue
            Watchers:
            10 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: