Crime reports are used to find criminals, prevent further violations, identify the problems that cause crime and allocate government resources. Unfortunately, many crimes go unreported. The National Crime Victimization Survey (NCVS) contains data about incidents, victims and suspects, and records whether each incident was reported. Current research using the NCVS is limited to statistical techniques, which provide only a limited ‘view’ of the data. Our goal is to use decision trees to predict whether a crime is reported. We compare decision trees built from domain knowledge with those created using three variable selection methods. We conclude that decision trees lead to the discovery of several new variables that merit further research.
Introduction
The financial loss due to violent and personal crimes in 2004 was $15.85 billion (Sedgwick, 2006), and 57.5% of these crimes were not reported to the police (BJS, 2005). Other costs of unreported crimes include counseling, alarms, electronic surveillance equipment and indirect costs such as insurance and taxes (Sedgwick, 2006). A nationwide survey has been conducted continuously since 1973 to better understand both reported and unreported crimes. The National Crime Victimization Survey (NCVS) gathers data on injury, theft, damage, the amount of lost work and other characteristics of the incident, victim and suspect. One of the goals of the NCVS is to understand the quantity and types of crimes that are not reported to the police (BJS, 2005). Each year, 45,000 households are interviewed about past incidents in which they were the victim. The NCVS is the main source of data on the characteristics of criminal victimizations (NACJD, 2006). It also describes the crime types not reported to law enforcement and the characteristics of violent offenders (NACJD, 2006).
The NCVS classifies each incident as a personal or property crime. Personal crimes include rape, sexual attack, robbery, assault and purse snatching. Property crimes include burglary, theft and vandalism. In 2005, for example, 51% of personal crimes and 59% of property crimes were not reported (BJS, 2006a). Table 1 shows the number of personal crimes in 2005, by crime type, and whether or not they were reported. A substantial percentage of these crimes were not reported.
Table 1. Number of victimizations, by crime type and whether or not reported (BJS, 2005)
Crime Type | Number of Victimizations | Reported (%) | Not Reported (%) | Unknown (%) |
Completed Violence | 1,658,660 | 62 | 37 | 1 |
Attempted/Threatened Violence | 3,515,060 | 41 | 57 | 2 |
Rape/Sexual Assault | 191,670 | 38 | 62 | 0 |
Crimes of Violence | 5,365,390 | 47 | 51 | 2 |
Completed robbery | 415,320 | 61 | 39 | 1 |
Attempted robbery | 209,530 | 36 | 64 | 0 |
Robbery | 624,850 | 52 | 47 | 1 |
Aggravated assault | 1,052,260 | 62 | 37 | 1 |
Simple assault | 3,304,930 | 42 | 55 | 2 |
Assault | 4,357,190 | 47 | 51 | 2 |
Completed purse snatching | 43,550 | 51 | 49 | 0 |
Attempted purse snatching | 3,260 | 0 | 100 | 0 |
Pocket picking | 180,260 | 32 | 67 | 2 |
Purse snatching/Pocket picking | 227,070 | 35 | 64 | 1 |
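To make the scale of underreporting in Table 1 concrete, the percentages can be converted into approximate counts of unreported victimizations. The following is a minimal sketch rather than part of the original analysis: the names `TABLE_1`, `estimated_unreported` and `summarize` are hypothetical, only the aggregate rows of Table 1 are included, and because the published percentages are rounded to whole numbers the resulting counts are approximations only.

```python
# Aggregate rows copied from Table 1:
# (crime type, number of victimizations, % reported, % not reported, % unknown)
TABLE_1 = [
    ("Rape/Sexual Assault", 191_670, 38, 62, 0),
    ("Robbery", 624_850, 52, 47, 1),
    ("Assault", 4_357_190, 47, 51, 2),
    ("Purse snatching/Pocket picking", 227_070, 35, 64, 1),
]


def estimated_unreported(victimizations: int, pct_not_reported: int) -> int:
    """Approximate number of unreported victimizations.

    Uses integer arithmetic and rounds to the nearest whole victimization.
    The input percentage is already rounded in the source table, so the
    result is an estimate, not an exact count.
    """
    return (victimizations * pct_not_reported + 50) // 100


def summarize(rows):
    """Map each crime type to its estimated number of unreported victimizations."""
    return {
        name: estimated_unreported(count, pct_no)
        for name, count, _pct_yes, pct_no, _pct_unknown in rows
    }


if __name__ == "__main__":
    for crime_type, unreported in summarize(TABLE_1).items():
        print(f"{crime_type}: about {unreported:,} not reported")
```

Working with counts rather than percentages shows which crime types contribute most to the total volume of unreported crime, which a comparison of percentages alone can obscure.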