Duplicate And Unique Items In Lists

advertisement
Duplicate And Unique Items In Lists
This page describes a variety of methods you can use when working with lists of data that ma
list using Excel97's Conditional Formatting feature. Next, we'll look at a way to count the numb
unique elements that exist in a list. Finally, we'll learn how to compare two lists of data and ex
For purposes of example, suppose we have two lists of data, named Range1 and Range2. Th
not be in adjacent columns, but they must be on the same worksheet. For example, we may d
within Range1 and Range2
Testing For Duplicate Entries
If you need to determine whether a list in Excel has duplicate entries, you can use the followin
Duplicates" if the range does not have any duplicates.
=IF(MAX(COUNTIF(Range1,Range1))>1,"Duplicates","No Duplicates")
This is an array formula, so you must press Ctrl+Shift+Enter rather than just Enter when
complete range contain data. If only the first N cells contain data, and the rest are empty, the
themselves. If you want the formula to look only that the cells that contains data, use a formul
=IF(MAX(COUNTIF(INDIRECT("A2:A"&(MAX((A2:A500<>"")*ROW(A2:A500)))),IN
Duplicates")
This formula will look only that the cells from A2 down to the last cell that contains data. This i
first enter the formula, and when ever you edit it later.
Highlighting Duplicate Entries
Our first task is to highlight the cells in Range1 that are duplicates. We use Excel's Conditiona
Formatting tool to accomplish this. First, highlight the entire Range1. Then, select the
Conditional Formatting tool from the Format menu: Format->Conditional Formatting.
Change the "Cell Value Is" option to "Formula Is" and enter the following formula in the
formula text box:
=IF(COUNTIF(Range1, A5)>1,TRUE,FALSE)
Where A5 is the first cell in Range1. Then, click the Format button and select the font or
color you want your cell formatted with. Finally, click OK. Duplicate entries in Range1 will be
formatted as you selected. For example, if "Able" occurs twice in Range1, both
occurrences of "Able" will appear highlighted.
Tagging Duplicate Entries
Instead of highlighting duplicate entries, you may want to "tag" them by placing the word
"Duplicate" next to each duplicate entry in the range. This is a simple task. In an empty
column next to Range1, enter the following formula, and use Fill Down to fill the column
with the formula:
=IF(COUNTIF(Range1,???)>1,"Duplicate","")
Change ??? to the first cell of Range1. The Fill Down operation will automatically change the
you fill down. If a cell is duplicated in Range1, the word "Duplicate" will appear next to each ce
Replacing Duplicate Entries
Suppose you have a list of group data
entries. By "grouped" I mean that in the
list, all values of equal value occur
together. The groups do not need to be
sorted, but like entries must occur
together. You can use some simple VBA
code to replace all duplicate entries in each
group, except the first, with a blank string,
or any other string. For example, consider
the data in the range to the left:
The following code will replace all the duplicate entries, except the
first, in each group with the string "----". To use this code, first
select the cells (in one column) that you want to change, then run
the procedure.
Sub FixDuplicateRows()
Dim RowNdx As Long
Dim ColNum As Integer
ColNum = Selection(1).Column
For RowNdx = Selection(Selection.Cells.Count).Row
To _
Selection(1).Row + 1 Step -1
If Cells(RowNdx, ColNum).Value = Cells(RowNdx
- 1, ColNum).Value Then
Cells(RowNdx, ColNum).Value = "----"
End If
Next RowNdx
End Sub
The figure to the left shows the same data
as above, after the code has been
executed.
Counting Unique Entries In A Range
To count the number of unique entries in the list, use the following array formula:
=SUM(IF(FREQUENCY(IF(LEN(Range1)>0,MATCH(Range1,Range1,0),""),
IF(LEN(Range1)>0,MATCH(Range1,Range1,0),""))>0,1))
This will return the number of unique entries in Range1. It will not count blanks at
all.
The formula above will return 3, indicating that there
are 3 unique, non-blank entries in the range (Able,
Baker, and Cathy). Click here for more information
about Array Formulas.
If your data does not have any blanks in the range, you
can use the following Array Formula, developed by
David Hager:
=SUM(1/COUNTIF(A1:A10,A1:A10))
If your data has only numeric values or blank cells, with
no text or string values, you can use the formula
=SUM(N(FREQUENCY(A1:A10,A1:A10)>0))
to count the number of unique values. This will count
only the number of unique numeric values, not
including text values.
Download a workbook illustrating these formulas.
Extracting Unique Entries
Next, we want to extract the unique values in Range1. In some circumstances, it is
easier to use Excel's Advance Filtering capabilities to extract the unique values.
There is one drawback to this approach, however. If you change the contents of the
list, you have to manually recreate the list of filtered data. By using worksheet
functions to extract the unique entries, the extracted data is kept current whenever
you change the contents of the original list.
You can do this with a very simple array formula.
=IF(COUNTIF($A$1:A1,A1)=1,A1,"")
Enter this formula in the first cell of the range you want to contain the unique
entries. Change A1 and $A$1 to the first cell in the range containing the from which
data that you want to extract unique items. Then, use Fill Down (from the Edit
menu) to fill the formula down to as many rows as you need to hold the unique
entries (i.e., up to as many rows as there are in the original range.)
You can then transfer these values to another range of cells and eliminate the
blank entries. See "Eliminating Blank Cells" for details about how to do this.
Extracting Values Common To Two Lists
You can easily extract values that appear in both of two lists. Suppose your lists are
in A1:A10 and B1:B10. Enter the following array formula in the first cell of the
range which is to contain the common entries:
=IF(COUNTIF($A$1:$A$10,B1)>0,B1,"")
Change B1 and $A$1:$A$10 to the first cells in the ranges from which data that
you want to extract common items. Then, use Fill Down (from the Edit menu) to fill
the formula down to as many rows as you need to hold the common entries (i.e., up
to as many rows as there are in the original range.)
Extracting Values On One List And Not Another
Another frequent tasks involving lists is to create a list containing values in one
range that are not in another range. Suppose there are two lists, in A1:A10 and
B1:B10. Enter the following array formula in the first cell of the range which is to
contain the entries in B1:B10 that do not occur in A1:A10.
=IF(COUNTIF($A$1:$A$10,B1)=0,B1,"")
Change B1 and $A$1:$A$10 to the first cells in the ranges from which data that
you want to extract items. Then, use Fill Down (from the Edit menu) to fill the
formula down to as many rows as you need to hold the common entries (i.e., up to
as many rows as there are in the original range.)
You can then transfer these entries to another range of cells and eliminate the blank
entries. See "Eliminating Blank Cells" for details about how to do this.
Other Excel formulas and VBA procedures for working with
duplicate and unique entries in lists can be found on the following
pages:
Preventing Duplicates On Entry
Duplicates
Deleting Rows
Download