A first step towards identifying spam is to create a list of words that are more likely to appear in spam than in normal messages. For instance, words like "buy" or the brand name of an enhancement drug are more likely to occur in spam messages than in normal messages. Suppose a specified list of words is available and that your database of 5000 messages contains 1700 that are spam. Among the spam messages, 1343 contain words in the list. Of the 3300 normal messages, only 297 contain words in the list.

Obtain the probability that a message is spam given that the message contains words in the list.
in Data Science & Statistics by Diamond (40,719 points) | 26 views

1 Answer

Best answer
Let \(A=\) [message contains words in the list] be the event that a message is identified as spam, and let \(B_{1}=\) [message is spam] and \(B_{2}=\) [message is normal]. We use the observed relative frequencies from the database as approximations to the probabilities.
\[
\begin{gathered}
P\left(B_{1}\right)=\frac{1700}{5000}=.34 \quad P\left(B_{2}\right)=\frac{3300}{5000}=.66 \\
P\left(A \mid B_{1}\right)=\frac{1343}{1700}=.79 \quad P\left(A \mid B_{2}\right)=\frac{297}{3300}=.09
\end{gathered}
\]
Bayes' Theorem expresses the probability that a message is spam, given that it is identified as spam (it contains words in the list), as
\[
P\left(B_{1} \mid A\right)=\frac{P\left(A \mid B_{1}\right) P\left(B_{1}\right)}{P\left(A \mid B_{1}\right) P\left(B_{1}\right)+P\left(A \mid B_{2}\right) P\left(B_{2}\right)}
\]
The updated, or posterior, probability is
\[
P\left(B_{1} \mid A\right)=\frac{.79 \times .34}{.79 \times .34+.09 \times .66}=\frac{.2686}{.328}=.819
\]
Because this posterior probability of being spam is quite large, we suspect that such a message really is spam. Finding words from the list raises the probability of spam from the prior \(P\left(B_{1}\right)=.34\) (34% of incoming messages are spam) to about .819, so we would likely want the spam filter to remove this message. Existing spam filter programs learn and improve as you mark your incoming messages as spam.
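As a quick check of the arithmetic, here is a minimal Python sketch that recomputes the posterior directly from the counts in the question. The function name `posterior_spam` and its parameter names are made up for illustration; they are not part of the original answer. Exact fractions are used so that rounding the intermediate values to .79 and .09 does not affect the result.

```python
from fractions import Fraction


def posterior_spam(n_spam, n_normal, spam_hits, normal_hits):
    """Return P(B1 | A): probability a message is spam given it contains listed words.

    n_spam, n_normal       -- number of spam / normal messages in the database
    spam_hits, normal_hits -- how many of each contain words in the list
    """
    total = n_spam + n_normal
    p_b1 = Fraction(n_spam, total)            # prior P(B1)
    p_b2 = Fraction(n_normal, total)          # prior P(B2)
    p_a_b1 = Fraction(spam_hits, n_spam)      # likelihood P(A | B1)
    p_a_b2 = Fraction(normal_hits, n_normal)  # likelihood P(A | B2)

    numerator = p_a_b1 * p_b1
    denominator = numerator + p_a_b2 * p_b2   # total probability P(A)
    return numerator / denominator


posterior = posterior_spam(1700, 3300, 1343, 297)
print(posterior, round(float(posterior), 3))  # 1343/1640 0.819
```

The exact value 1343/1640 also shows a shortcut: of the 1343 + 297 = 1640 messages that contain words in the list, 1343 are spam, which agrees with the Bayes' Theorem result of about .819.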
by Diamond (40,719 points)
