A very useful application of subsetting data is to find and remove duplicate values. R has a useful function, duplicated(), that finds duplicate values and returns a logical vector that tells you whether the specific value is a duplicate of a previous value. This means that for duplicated values, duplicated() returns FALSE for the first occurrence and TRUE for every following occurrence of that value, as in the following example:

```> duplicated(c(1,2,1,3,1,4))
[1] FALSE FALSE TRUE FALSE TRUE FALSE```

If you try this on a data frame, R automatically checks the observations (meaning, it treats every row as a value). So, for example, with the data frame iris:

```> duplicated(iris)
[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[10] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
....
[136] FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE FALSE
[145] FALSE FALSE FALSE FALSE FALSE FALSE```

If you look carefully, you notice that row 143 is a duplicate (because the 143rd element of your result has the value TRUE). You also can tell this by using the which() function:

```> which(duplicated(iris))
[1] 143```

Now, to remove the duplicate from iris, you need to exclude this row from your data. Remember that there are two ways to exclude data using subsetting:

• Specify a logical vector, where FALSE means that the element will be excluded. The ! (exclamation point) operator is a logical negation. This means that it converts TRUE into FALSE and vice versa. So, to remove the duplicates from iris, you do the following:

`> iris[!duplicated(iris), ]`
• Specify negative values. In other words:

```> index <- which(duplicated(iris))
> iris[-index, ]```

In both cases, you’ll notice that your instruction has removed row 143.