class MultinomialNB Found at: sklearn.naive_bayes

class MultinomialNB(BaseDiscreteNB):
    """
    Naive Bayes classifier for multinomial models

    The multinomial Naive Bayes classifier is suitable for classification with
    discrete features (e.g., word counts for text classification). The
    multinomial distribution normally requires integer feature counts. However,
    in practice, fractional counts such as tf-idf may also work.

    Read more in the :ref:`User Guide <multinomial_naive_bayes>`.

    Parameters
    ----------
    alpha : float, optional (default=1.0)
        Additive (Laplace/Lidstone) smoothing parameter
        (0 for no smoothing).

    fit_prior : boolean, optional (default=True)
        Whether to learn class prior probabilities or not.
        If false, a uniform prior will be used.

    class_prior : array-like, size (n_classes,), optional (default=None)
        Prior probabilities of the classes. If specified, the priors are not
        adjusted according to the data.

    Attributes
    ----------
    class_log_prior_ : array, shape (n_classes,)
        Smoothed empirical log probability for each class.

    intercept_ : property
        Mirrors ``class_log_prior_`` for interpreting MultinomialNB
        as a linear model.

    feature_log_prob_ : array, shape (n_classes, n_features)
        Empirical log probability of features
        given a class, ``P(x_i|y)``.

    coef_ : property
        Mirrors ``feature_log_prob_`` for interpreting MultinomialNB
        as a linear model.

    class_count_ : array, shape (n_classes,)
        Number of samples encountered for each class during fitting. This
        value is weighted by the sample weight when provided.

    feature_count_ : array, shape (n_classes, n_features)
        Number of samples encountered for each (class, feature)
        during fitting. This value is weighted by the sample weight when
        provided.

    Examples
    --------
    >>> import numpy as np
    >>> X = np.random.randint(5, size=(6, 100))
    >>> y = np.array([1, 2, 3, 4, 5, 6])
    >>> from sklearn.naive_bayes import MultinomialNB
    >>> clf = MultinomialNB()
    >>> clf.fit(X, y)
    MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)
    >>> print(clf.predict(X[2:3]))
    [3]

    Notes
    -----
    For the rationale behind the names `coef_` and `intercept_`, i.e.
    naive Bayes as a linear classifier, see J. Rennie et al. (2003),
    Tackling the poor assumptions of naive Bayes text classifiers, ICML.

    References
    ----------
    C.D. Manning, P. Raghavan and H. Schuetze (2008). Introduction to
    Information Retrieval. Cambridge University Press, pp. 234-265.
    classification-1.html
    """

    def __init__(self, alpha=1.0, fit_prior=True, class_prior=None):
        self.alpha = alpha
        self.fit_prior = fit_prior
        self.class_prior = class_prior
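
    def _count(self, X, Y):
        """Count and smooth feature occurrences."""
        # For sparse input, only the explicitly stored values need checking.
        if np.any((X.data if issparse(X) else X) < 0):
            raise ValueError("Input X must be non-negative")
        # Y is the (n_samples, n_classes) indicator matrix, so Y.T @ X sums
        # the feature counts of the samples belonging to each class.
        self.feature_count_ += safe_sparse_dot(Y.T, X)
        self.class_count_ += Y.sum(axis=0)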

    def _update_feature_log_prob(self, alpha):
        """Apply smoothing to raw counts and recompute log probabilities"""
        smoothed_fc = self.feature_count_ + alpha
        smoothed_cc = smoothed_fc.sum(axis=1)
        self.feature_log_prob_ = (np.log(smoothed_fc) -
                                  np.log(smoothed_cc.reshape(-1, 1)))

    def _joint_log_likelihood(self, X):
        """Calculate the posterior log probability of the samples X"""
        check_is_fitted(self, "classes_")
        X = check_array(X, accept_sparse='csr')
        return (safe_sparse_dot(X, self.feature_log_prob_.T) +
                self.class_log_prior_)
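
The three private methods above amount to a few array operations: accumulate per-class feature counts, add `alpha` and normalise in log space, then score samples with a dot product plus the class log prior. The following is a minimal dense-only sketch of that arithmetic in plain NumPy, not the library's implementation. The helper names `fit_multinomial_nb` and `joint_log_likelihood` are hypothetical. The class prior formula is also an assumption: it uses the empirical class frequencies (the `fit_prior=True` case), which is computed outside the code shown here.

```python
import numpy as np


def fit_multinomial_nb(X, y, alpha=1.0):
    """Hypothetical helper: dense re-derivation of the fitted quantities."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    if np.any(X < 0):
        raise ValueError("Input X must be non-negative")

    classes = np.unique(y)
    # One-hot indicator matrix Y, shape (n_samples, n_classes).
    Y = (y[:, None] == classes[None, :]).astype(float)

    # Same accumulation as _count: per-class feature totals and class sizes.
    feature_count = Y.T @ X          # shape (n_classes, n_features)
    class_count = Y.sum(axis=0)      # shape (n_classes,)

    # Same smoothing as _update_feature_log_prob.
    smoothed_fc = feature_count + alpha
    smoothed_cc = smoothed_fc.sum(axis=1)
    feature_log_prob = np.log(smoothed_fc) - np.log(smoothed_cc.reshape(-1, 1))

    # Assumption: empirical class prior, as with fit_prior=True.
    class_log_prior = np.log(class_count) - np.log(class_count.sum())
    return classes, feature_log_prob, class_log_prior


def joint_log_likelihood(X, feature_log_prob, class_log_prior):
    """Hypothetical helper: unnormalised log posterior, one row per sample."""
    return np.asarray(X, dtype=float) @ feature_log_prob.T + class_log_prior


# Toy data: class 0 mostly uses feature 0, class 1 mostly uses feature 2.
X = np.array([[2, 1, 0],
              [3, 0, 0],
              [0, 1, 4],
              [0, 0, 5]])
y = np.array([0, 0, 1, 1])

classes, flp, clp = fit_multinomial_nb(X, y, alpha=1.0)
jll = joint_log_likelihood(X, flp, clp)
pred = classes[np.argmax(jll, axis=1)]
print(pred)  # [0 0 1 1]
```

With `alpha=1.0`, the class 0 counts `[5, 1, 0]` become `[6, 2, 1]` and normalise to `[6/9, 2/9, 1/9]`, so a feature never seen in a class still gets non-zero probability and the log stays finite.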


