Thursday, December 12, 2013

How To Remove Stop Words From A String Using PHP?

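I want to strip whole stop words from a string wherever they occur, without touching longer words that merely contain them (for example, removing "and" but keeping "sand" and "band").
I have created an array of the forbidden words (stop words).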
$string = 'sand band or nor and where whereabouts foo';
$stopwords = array("or", "and", "where");
Here are two ways to achieve it:
Alternative 1
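// Join the stop words into one delimited pattern: /\b(?:or|and|where)\b/
echo preg_replace('/\b(?:' . implode('|', array_map('preg_quote', $stopwords)) . ')\b/', '', $string);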
//output : sand band  nor   whereabouts foo
Alternative 2
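// Turn each stop word into its own whole-word pattern (this modifies $stopwords in place).
foreach ($stopwords as &$word) {
    $word = '/\b' . preg_quote($word, '/') . '\b/';
}
unset($word); // break the reference left over from the loop
echo preg_replace($stopwords, '', $string);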
//output : sand band  nor   whereabouts foo
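Both alternatives leave extra spaces where the removed words used to be. The following is a minimal sketch of how the removal and a whitespace cleanup could be combined. The function name remove_stop_words is hypothetical and not part of the original post or the referenced answers, and the type declarations assume PHP 7 or later.

```php
<?php
// Hypothetical helper (not from the original post): removes whole-word
// stop words, then tidies the whitespace they leave behind.
function remove_stop_words(string $string, array $stopwords): string
{
    if ($stopwords === []) {
        return $string; // nothing to remove
    }

    // Escape each word so regex metacharacters are treated literally.
    $quoted = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $stopwords);

    // One pattern with alternation, e.g. /\b(?:or|and|where)\b/
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/';
    $stripped = preg_replace($pattern, '', $string);

    // Collapse runs of whitespace and trim both ends.
    return trim(preg_replace('/\s+/', ' ', $stripped));
}

$string = 'sand band or nor and where whereabouts foo';
$stopwords = array("or", "and", "where");
echo remove_stop_words($string, $stopwords);
// output : sand band nor whereabouts foo
```

The match is case-sensitive, as in the alternatives above; adding the i modifier to the pattern would also remove "And" or "OR".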
Ref: http://codepad.org/lrXknCO4 and http://stackoverflow.com/a/9342200

3 comments:

  1. $strarray = array();
    $strarray = explode(' ', $string);
    $new_array_without_stopwords = array_diff($strarray, $stopwords);

    1. Good Job.

      You can remove stop words from the string using the array_diff function as well.

      P.S.: The array_diff function returns an array containing all the entries from the first array that are not present in any of the other arrays.

      Cheers,
      Anup
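
The array_diff approach from this thread returns an array rather than a string, so one more step is needed to get the cleaned string back. A minimal sketch, assuming the same $string and $stopwords as in the post; the variable names $words, $kept and $result are illustrative only.

```php
<?php
$string = 'sand band or nor and where whereabouts foo';
$stopwords = array("or", "and", "where");

// Split on whitespace; PREG_SPLIT_NO_EMPTY drops empty tokens.
$words = preg_split('/\s+/', $string, -1, PREG_SPLIT_NO_EMPTY);

// Keep only the words that do not appear in the stop word list.
$kept = array_diff($words, $stopwords);

// array_diff preserves the original keys; implode ignores them
// and joins the remaining words with single spaces.
$result = implode(' ', $kept);
echo $result;
// output : sand band nor whereabouts foo
```

Because array_diff compares whole tokens, "whereabouts" is kept without any regular expression. A stop word with punctuation attached (such as "and,") would not be removed by this version.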

  2. Thank you so much for this, I was looking for a solution whereby I didn't have to store the regular expression syntax as part of the word in the stop word array. Worked perfectly for me. Thanks again.
