We have a list of 15 numbers here, and what I want to do is think about which of them are outliers. Let's actually visualize the distribution of the numbers. So here, on a number line, I have all the numbers from one to 19. When you look visually at the distribution of numbers, it looks like the meat of the distribution, so to speak, is in this area, right over here. And so some people might say, "Okay, we have three outliers. There are these two ones and the six." Some people might say, "Well, the six is kinda close enough. Maybe only these two ones are outliers." And those would actually both be reasonable things to say.

Now to get on the same page, statisticians will sometimes use a rule. We say that anything more than one and a half times the interquartile range below Q-one or above Q-three is going to be an outlier. Well, what am I talking about? Let's figure out the median, Q-one and Q-three here, and then work out, by that definition, what is going to be an outlier. If that all made sense to you so far, I encourage you to pause this video and try to work through it on your own, or I'll do it for you right now.

All right, so what's the median here? Well, the median is the middle number. With 15 numbers, the middle number is going to be whatever number has seven on either side. There are seven numbers on the left, and one, two, three, four, five, six, seven numbers on the right side too. Is that right? Yep, six, seven, so that's the median. Now what is Q-one? Well, Q-one is going to be the middle of the first group. This first group has seven numbers in it, so its middle number has three to the left and three to the right. That gives a Q-one of 13. And then Q-three is going to be the middle of the upper group, which is 18. What is the interquartile range going to be? The interquartile range is Q-three minus Q-one, the difference between 18 and 13. That is 18 minus 13, which is equal to five.

Now to figure out outliers: outliers are going to be anything less than Q-one minus 1.5 times our interquartile range. This is something that statisticians have kind of agreed on. If we want a better definition for outliers, let's just agree that an outlier is something more than one and a half times the interquartile range below Q-one, or greater than Q-three plus one and a half times the interquartile range. And once again, this is somewhat arbitrary; people just decided it felt right. One could argue the multiplier should be one, or two, or whatever.

So the lower cutoff is going to be 13 minus 1.5 times our interquartile range. The interquartile range is five, and 1.5 times five is 7.5. 13 minus 7.5 is what? 13 minus seven is six, and then you subtract another half to get 5.5. For the upper cutoff, Q-three is 18, and this is, once again, 7.5. 18 plus 7.5 is 25.5. So outliers are less than 5.5 or greater than 25.5.

So based on this, we have kind of a numerical definition for what's an outlier. We're not just subjectively saying, well, this feels right or that feels right. By this definition we only have two outliers: only these two ones. The six is above 5.5, so it does not count.

Now another thing to think about is drawing box-and-whiskers plots based on Q-one, our median, Q-three and the full range of the numbers. And you could do it either taking your outliers into consideration or not. Let's actually draw a box-and-whiskers plot. Actually, let me do two here. That's one, and then let me put another one down there.

Now if we were to just draw a classic box-and-whiskers plot here, we would say, all right, the box goes from Q-one to Q-three. Let me actually draw that as a box. Now if we don't want to treat the outliers separately, we would say, well, what's the entire range here? We have things that go from one all the way to 19. So one way to do it is to say, hey, we start at one, and our whiskers go all the way from one to 19.

But if we don't want to include those outliers, and we want to make it clear that they're outliers, well, let's not include them. What we can do instead is say, all right, including only our non-outliers, we would start at six, 'cause six, we're saying, is in our data set but is not an outlier. So we are going to start at six and go all the way to 19. And to show that we have these outliers, we would mark them as separate points. So once again, this is a box-and-whiskers plot of the same data set without outliers. After you check the distribution of the data by plotting the histogram, the second thing to do is to look for outliers.