Business Data Model

Question 1 (10 marks)
Sandra Enright of Techtronics Inc., an electronics supply firm, has been examining the times required for stock pickers to fill orders requested from inventory. She has determined that individual order-filling times approximately follow a normal distribution with a mean of 3.2 minutes and a standard deviation of 68 seconds.
a) What is the probability that a randomly selected order will require more than three minutes?
b) What is the probability that a randomly selected order will require less than two minutes?
c) What is the probability that a randomly selected order will require between two and three minutes?
d) Sandra is considering a quality assurance guarantee that 95% of orders will be filled within a specified time. What time should she specify?

Question 2 (15 marks)
Gerald Black of BlackFly Airline has an exclusive contract to run flights of a four-passenger aircraft to a remote mining center. His contract requires him to fly if there are any passengers wanting to make the trip.

His fixed costs per day are $400.00, his fixed costs per flight are $1,200.00, the variable cost per passenger is $25.00, and he charges $850.00 per passenger. He has tracked the number of passengers who flew with him over the past sixty days. His findings are summarized in the following table:

Number of Passengers | 0 | 1  | 2  | 3  | 4
Number of Days       | 5 | 12 | 15 | 21 | 7

Of course, he does not fly on days with zero passengers. Assume that this sample gives a good approximation to his future demand patterns. Let G be the random variable: profit on a future day.
a) Calculate the Expected Value, E[G], Variance, σ²[G], and standard deviation, σ[G], of his future daily profit. [Hint: You can calculate a profit corresponding to each number of passengers. The probabilities of those profits are then determined by the probabilities of the numbers of passengers.]
b) Comment briefly on the profitability and volatility of Gerald’s business.

Question 3 (15 marks)
Local development initiatives often use estimates of the daily expenditures of tourists to justify expenses incurred in supporting local events. Some years ago, the City of Kingston hosted a Tall Ships weekend, which cost the City some $850,000.

To justify this expense, suppose that the City conducted a survey of 30 out-of-town visitors, asking them the grand total of what they spent during their visit to Kingston, and how many days they visited. The data are contained in the file visitor expenditures.xlsx, and are summarized below.
a. What is a 95% confidence interval for the average daily expenditure by visitors to Kingston, based on these data? Interpret the meaning of your interval, in English.
b. The Mayor of Kingston at the time, Ms. Turner, had stated that “the average visitor to Kingston spends $100 per day in the local economy”. Set up Ms. Turner’s comment as a hypothesis test, and use the data to establish whether her statement can be refuted, or not.
c. Some visitors to Kingston are Canadian, and some come from other parts of the world. A sample of 200 visitors on this weekend revealed that 120 were Canadian, and 80 were from elsewhere. What is a 95% confidence interval for the proportion of visitors who are Canadian on a weekend such as this?
d. The Mayor had also stated that “more than 50% of the visitors to Kingston are Canadians”. Set up this statement by the Mayor in a hypothesis test framework, and use the data to determine if her statement can be refuted, or not.

Question 4 (10 marks)
Fast Computers Inc. supplies made-to-order personal computers through direct (telephone and online) sales channels. A key competitive feature of its business is the delivery time: the time lapse between receipt of an order and final delivery to the customer. By tracking thousands of previous orders, the company has found that delivery times to the large Toronto market are well approximated by a normal distribution with a mean of 5.4 days and a standard deviation of 1.34 days. In answering the following questions, you may assume that these parameters are known exactly.
a) What is the probability that a randomly selected order will require more than six days?
b) What is the probability that a randomly selected order will require between three and six days?
c) The company is considering a $50 cash-back guarantee on any orders that have delivery times longer than a specified maximum, but is not sure what maximum time to guarantee. It would be willing to pay out the guarantee on no more than 3% of deliveries. What time limit, in whole days, should it set for the guarantee?
d) What is the probability that the mean time for a sample of fifteen randomly selected orders will be more than six days?
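All four parts are normal-distribution lookups of the kind worked in the solutions below. As a sketch, Python's standard library `statistics.NormalDist` plays the role of Excel's NORMDIST/NORMSINV:

```python
from statistics import NormalDist

# Delivery times: normal with mean 5.4 days, sd 1.34 days
delivery = NormalDist(mu=5.4, sigma=1.34)

p_over_6 = 1 - delivery.cdf(6)                 # a) P(T > 6)  ~0.327
p_3_to_6 = delivery.cdf(6) - delivery.cdf(3)   # b) P(3 < T < 6)  ~0.636

# c) time exceeded by only 3% of orders, i.e. the 97th percentile
t_97 = delivery.inv_cdf(0.97)                  # ~7.92 days -> guarantee 8 days

# d) mean of 15 orders: the standard error is sigma / sqrt(n)
mean_15 = NormalDist(mu=5.4, sigma=1.34 / 15 ** 0.5)
p_mean_over_6 = 1 - mean_15.cdf(6)             # ~0.041
```

Note in part d) that only the standard deviation changes: the sampling distribution of the mean is narrower than the distribution of a single order by a factor of √15.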

Assignment One – Solutions

Question 1
The standard deviation of 68 seconds is 68/60 = 1.133 minutes.
a) 1 – NORMDIST(3,3.2,1.133,1) = 0.57
b) NORMDIST(2,3.2,1.133,1) = 0.1448
c) NORMDIST(3,3.2,1.133,1) – NORMDIST(2,3.2,1.133,1) = 0.2851
d) =NORMINV(0.95,3.2,1.133) = 5.064 minutes

Question 2
Part a: the profit table and calculations are in the accompanying spreadsheet (not reproduced here).
Part b: Daily profit is quite good at about $329, but note that the standard deviation is large by comparison, at about $794. Gerald can thus expect to experience a lot of volatility in his profits.

Question 3
a) We first must express the data in a comparable way, by computing the equivalent daily expenditure for each visitor, dividing each total by the number of days, as shown in the spreadsheet visitors analysis.xlsx. We can then use the spreadsheet command Tools, Data Analysis, Descriptive Statistics to find the summary measures for the data. The sample mean (average) daily expenditure is $76.87, and the standard error is $4.254. A 95% confidence interval for the population mean is the sample mean ± 1.96 standard errors, or 76.87 ± 1.96 × 4.254, or 76.87 ± 8.34. You could also express this as the range [68.53, 85.21]. The interpretation is that Ms. Turner can be quite sure that the actual true average daily expenditure is somewhere in the range from $68.53 to $85.21 (but it could be anywhere in this range).
b) Let the symbol μ represent the actual mean daily expenditure of visitors to Kingston. One way to test her assertion is to assume that she is right, and then see if the evidence would cause us to reject this assumption, and therefore conclude that she is wrong. In this framework, we can express Ms. Turner’s statement as the null hypothesis
H0: μ ≥ 100
versus the alternative hypothesis
H1: μ < 100.
(Here, Ms. Turner is evidently making the case that the expenses are justified because visitors contribute so much to Kingston’s economy.)
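The Question 1 and Question 2 figures above can be cross-checked with a short Python sketch using only the standard library; the daily profit is reconstructed from the fares and costs given in the question:

```python
from statistics import NormalDist

# Question 1: order-filling times, mean 3.2 min, sd 68 s = 1.133 min
filling = NormalDist(mu=3.2, sigma=68 / 60)

p_over_3 = 1 - filling.cdf(3)                  # a) ~0.57
p_under_2 = filling.cdf(2)                     # b) ~0.1448
p_between = filling.cdf(3) - filling.cdf(2)    # c) ~0.2851
t_95 = filling.inv_cdf(0.95)                   # d) ~5.064 minutes

# Question 2: on a day with n >= 1 passengers the profit is
# 850n revenue - 25n variable - 1200 per flight - 400 per day = 825n - 1600;
# on a zero-passenger day Gerald does not fly, so the profit is -400.
days = {0: 5, 1: 12, 2: 15, 3: 21, 4: 7}       # observed frequencies (60 days)
profit = {n: (825 * n - 1600 if n else -400) for n in days}

total_days = sum(days.values())
e_g = sum(profit[n] * d for n, d in days.items()) / total_days    # ~$328.75
var_g = sum((profit[n] - e_g) ** 2 * d for n, d in days.items()) / total_days
sd_g = var_g ** 0.5                                               # ~$794
```

The expected profit of about $329 and standard deviation of about $794 match the Part b comment above.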

Her case would be even more supported if the mean expenditure were greater than $100, so the format for the null hypothesis above is the correct one for this problem. The two-sided hypotheses
H0: μ = 100
H1: μ ≠ 100
are not appropriate, since rejecting that null hypothesis wouldn’t indicate whether the expenditures were more or less than $100 per day, and our conclusions would differ in those two cases. You could also have run the test by setting up the null hypothesis under the assumption that she is not correct in her statement, and then trying to reject this assumption through the data to prove she is right.

That is, you could set up the null hypothesis as
H0: μ ≤ 100
versus the alternative hypothesis
H1: μ > 100.
Rejecting this null hypothesis would prove the Mayor correct. To test the hypothesis, we assume that the null hypothesis is true at equality and compute the z-score of the observed mean: z = (76.87 − 100)/4.254 = −5.44. The chance of seeing a mean as low as $76.87 if the null hypothesis is correct can be found with the command =NORMSDIST(-5.44), which gives the value 3E-08, or 3 × 10^-8. That is, the chance of seeing such a low mean if the Mayor is correct is 0.00000003.
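Taking the sample mean ($76.87) and standard error ($4.254) reported above as given, the part a) interval and part b) test can be sketched in Python:

```python
from statistics import NormalDist

std_normal = NormalDist()          # standard normal: mean 0, sd 1
xbar, se = 76.87, 4.254            # summary statistics from the survey

# a) 95% confidence interval for the mean daily expenditure
z95 = std_normal.inv_cdf(0.975)                  # ~1.96
ci = (xbar - z95 * se, xbar + z95 * se)          # ~(68.53, 85.21)

# b) one-sided test of H0: mu >= 100 against H1: mu < 100
z = (xbar - 100) / se                            # ~ -5.44
p_value = std_normal.cdf(z)                      # ~3e-08, far below any usual alpha
```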

Since this probability is so small, the Mayor must be wrong, and the average expenditure of a visitor is less than $100 per day. (If you had chosen the second form of the test, the sample mean of 76.87 would indicate that you could not reject the null hypothesis at any significance level, so the assumption that the Mayor is not correct cannot be refuted. Not rejecting a null hypothesis is not as powerful a result as rejecting one, so this second form of the test is not as compelling as the first. But it is not “incorrect” to do the test this way.)
c) See the spreadsheet for the calculations.

The observed proportion is p = 0.6. The standard error of the proportion is sp = √(p(1 − p)/n) = √(0.6 × 0.4/200) ≈ 0.0346. The 95% confidence interval for the proportion of Canadian visitors is thus 0.6 ± 1.96 × 0.0346 = [0.532, 0.668].
d) Let the symbol π represent the actual proportion of Canadian visitors to Kingston. We want to see if we can refute Ms. Turner’s statement that π > 0.5. To do so, we could take the contrary of the Mayor’s statement, i.e. that π ≤ 0.5, as the null hypothesis, and try to refute it using the data. That is, we could express the null hypothesis as
H0: π ≤ 0.5
versus the alternative hypothesis
H1: π > 0.5.
As always, we assume that the null hypothesis is correct at equality, and establish whether the data refute this assumption. That is, we assume π = 0.5 and ask, “How likely is it that we would have seen so large a sample proportion as p = 0.6 if this is true?” From part c, our best estimate for the standard error is sp ≈ 0.0346. The z-score for the observed sample proportion is z = (0.6 − 0.5)/0.0346 = 2.89. The probability of seeing such a large sample p if the null hypothesis is correct can be calculated with the Excel command =1-NORMSDIST(2.89), which gives the value 0.0019. That is, there is a 0.19% chance of seeing a sample proportion as large as 0.6 if the population proportion is 0.5. Since this is a small probability, we can reject the null hypothesis and conclude that the Mayor is likely correct in her assertion. Again with this question, you could also have expressed the Mayor’s statement in the null hypothesis: that is, you could have tested
H0: π ≥ 0.5
versus the alternative hypothesis
H1: π < 0.5.
In this case, the observation of p = 0.6 is entirely consistent with the null hypothesis, so you could not reject the null and would conclude again that the Mayor is likely correct.
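The proportion interval and test work the same way as the mean-based versions; a stdlib sketch:

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

n, canadian = 200, 120
p_hat = canadian / n                           # 0.6
se = sqrt(p_hat * (1 - p_hat) / n)             # ~0.0346

# c) 95% confidence interval for the proportion of Canadians
ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)    # ~(0.532, 0.668)

# d) one-sided test of H0: pi <= 0.5 against H1: pi > 0.5
z = (p_hat - 0.5) / se                         # ~2.89
p_value = 1 - std_normal.cdf(z)                # ~0.0019
```

This mirrors the solution's use of the sample-based standard error; some texts would instead use √(0.5 × 0.5/n) under the null, which gives a slightly smaller z but the same conclusion.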

However, not rejecting the null hypothesis is again not a strong conclusion to draw, since we assumed its truth at the outset. So this second form of the test is not as compelling as the first one above.

Question 4
a) What is the probability that a randomly selected order will require more than six days?
1 – NORMDIST(6,5.4,1.34,1) = 0.3272; or Z = (6 − 5.4)/1.34 = 0.448 and 1 – NORMSDIST(0.448) = 0.327.
b) What is the probability that a randomly selected order will require between three and six days?
NORMDIST(6,5.4,1.34,1) – NORMDIST(3,5.4,1.34,1) = 0.6362
c) We need a delivery time that is exceeded only 3% of the time (that is, 97% of orders take less than or equal to that time). This is an ‘inverse’ normal probability question. The Excel function =NORMSINV(0.97) indicates that the guaranteed delivery time should be about 1.88 standard deviations above the mean; that is, 5.4 + (1.88)(1.34) = 7.92 days. In whole days, they must set their guarantee at 8 days. This will actually produce a slightly better 2.6% late percentage, since 1 – NORMDIST(8,5.4,1.34,1) = 0.0262.
d) This is a similar question to part a), except that we are now concerned with a sample of fifteen orders.

The mean of a sample should have much smaller variability than the time for a single order. Also, we are given the population standard deviation of 1.34 as a known quantity, and the population is assumed normal; thus the sample mean will follow a normal distribution for any sample size. The standard error of the sample mean is 1.34/√15 = 0.346. The probability that the average time for fifteen orders exceeds six days is 1 – NORMDIST(6,5.4,0.346,1) = 0.041, which is much smaller than the 33% expected for a single order.

Team Question
The case is concerned with potential biases in the awarding of performance incentives in Alliance.

It is evident from the case that employees are concerned with the possibility of perceived unfairness in the way bonuses are awarded, so we should look for sources of bias in the awards. See the worksheet attached to this solution, which details each of the operations carried out below. Potential sources of bias include differences in performance ratings by sex, by racial category, or by age. We should check whether any of these are present.
1) Bias by sex or race: the spreadsheet contains new columns holding only the performance indicators for females, and only those for racial minorities:

The formula in cell G3 is =IF(B3=1,F3,""), which has the effect of copying the performance ratings of females (those with B3=1) into column G, and putting a blank "" in the column for males. (Note there is no space between the quotation marks in the command.) The formula in cell I3 is =AVERAGE(F3:F157), which finds the plant-wide average performance rating, and the formula in I5 is =AVERAGE(G3:G157), which averages only the female ratings. We see that there is practically no difference between the plant average of 6.135 and the female average of 6.146 (in fact, women have a slight edge over men here), so there appears to be no bias in performance ratings in terms of sex differences.
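The same subgroup comparison can be expressed outside the spreadsheet. In this Python sketch the four records are made-up stand-ins for the worksheet rows (sex coded 1 for female, as in column B):

```python
# Illustrative stand-in data, not the actual worksheet rows
employees = [
    {"sex": 1, "rating": 6.5},
    {"sex": 0, "rating": 5.8},
    {"sex": 1, "rating": 6.1},
    {"sex": 0, "rating": 6.0},
]

# Equivalent of =AVERAGE(F3:F157) over all rows...
plant_avg = sum(e["rating"] for e in employees) / len(employees)

# ...and of =IF(B3=1,F3,"") followed by =AVERAGE(G3:G157):
# average the ratings of one subgroup and compare with the plant average
female_ratings = [e["rating"] for e in employees if e["sex"] == 1]
female_avg = sum(female_ratings) / len(female_ratings)
```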

Similarly, the formula in cell H3 is =IF(C3=1,F3,""), which picks out the performance ratings for minority workers only, in column H. We see that racial minorities have an average performance rating of 6.257, somewhat higher than the plant average as a whole. There may be a slight bias favouring minorities, but again it is very small. To uncover possible bias with respect to age, we could plot performance ratings against age to see if there is a systematic pattern. The plot (included in the worksheet) shows no evident relationship between performance rating and age within the plant. (Aside: although we will not have covered the measurement of correlation between two variables by the time this assignment is due, a more formal check of a relationship between Rating and Age would be to calculate the correlation between them with the spreadsheet command =CORREL(D3:D157,F3:F157), which gives a value of −0.06. As we’ll soon discover, this is very low, indicating no relationship between these variables.)
Now the central question: how can we identify the performance high flyers in each department if the assessors are not consistent from department to department?

We need some way to standardize the assessments of each departmental manager, to create a cross-department comparison and find the top plant-wide performers. One naive way would be to allot, say, 3 bonuses to each department and let each manager pick his or her top 3, which would give 21 bonuses all together. But one department may have relatively more top guns than another, so this quota system wouldn’t be fair to them. There are various alternative systems; here is one. Each department manager’s performance assessments will have some middle value (a mean), and some spread around the middle (a standard deviation).

In order to standardize performance scores for comparison purposes, we can construct a z-score for each employee, based on their performance relative to their own department, by the formula

z = (employee’s rating − department mean) / (department standard deviation)

Then we could sort the employees in decreasing order of their z-scores to identify the employees who have an outstanding performance record in comparison with their peers. We could take the top 20% of this list as our bonus set. The spreadsheet sorts all the employees on department and then finds the department mean and standard deviation from the sorted data.

It then finds each employee’s z-score from these department values, and sorts the result in order of decreasing z-score. For completeness, we can examine what fraction of employees would exceed any given member’s score if the scores followed a normal distribution. The spreadsheet command =1-NORMSDIST(z-score) gives this “tail area”. Thus, for instance, the approximate fraction of the whole employee body with a score at or above that of employee number 142, who has a z-score of 1.35943, is =1-NORMSDIST(1.35943), or 0.081; i.e. 8.1% of employees score at or above this employee. This is tabulated in the worksheet for the top 25 employees. You can see from that table that there is quite a gap (over 2%) in the last column between ranks 19 and 20 in terms of the fraction above these two employees. Also, once we reach the employee at rank 22, the fraction remains steady, which indicates that employees 22 through 25 are indistinguishable: they have the same z-score. The same is true for the next eight employees.

Thus you could argue that the bonus cut-off should be at rank 25, since those employees clearly lie in the upper tail of performance scores; on the same basis, you could also argue for awarding the bonus to the top 33 employees. In the first case we have fewer than the desired 20%, while in the second we are over budget. Either approach could be defended; cutting off at, say, 30 or 31 could not, since that would separate employees with identical z-scores. There may also be a desirable side effect of this method: employees are, in a sense, competing for bonuses with members of their own department, and not against departments whose supervisors may be seen as “softer” than their own.
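The whole procedure described above (department statistics, per-employee z-scores, ranking, and the normal tail fraction) can be sketched in Python; the six records here are invented for illustration and are not the worksheet data:

```python
from collections import defaultdict
from statistics import NormalDist, mean, stdev

# Invented (department, employee id, rating) records for illustration
records = [
    ("A", 101, 7.1), ("A", 102, 5.9), ("A", 103, 6.4),
    ("B", 201, 8.0), ("B", 202, 7.6), ("B", 203, 8.9),
]

# Mean and (sample) standard deviation of ratings within each department
by_dept = defaultdict(list)
for dept, _, rating in records:
    by_dept[dept].append(rating)
dept_stats = {d: (mean(rs), stdev(rs)) for d, rs in by_dept.items()}

# z-score each employee against their own department, then rank by z-score
scored = [
    (emp, (rating - dept_stats[dept][0]) / dept_stats[dept][1])
    for dept, emp, rating in records
]
ranked = sorted(scored, key=lambda item: item[1], reverse=True)

# Fraction of a normal population scoring at or above the top employee,
# i.e. the "=1-NORMSDIST(z)" tail area used in the solution
top_emp, top_z = ranked[0]
tail = 1 - NormalDist().cdf(top_z)
```

Note how standardization changes the ranking: employee 203's raw rating (8.9) and employee 101's (7.1) come from departments with different means and spreads, so only their z-scores are directly comparable.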
