Notices


Closed Thread
 
LinkBack Thread Tools Display Modes

Old 07-31-2008, 04:55 AM   #1 (permalink)
Junior Member

Join Date: Apr 2008
Posts: 17
Rep Power: 0 RyanAbbott is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

Duplicate Criteria

Hello,

I am currently in the process of running a duplicate report on RE but wanted to find out what criteria everyone uses to ensure you are picking up every duplicate on the database.

If you have the time please can you tell me the criteria you use for both Individual and Organisational searches.

At the moment I am searching on:
Ind - Surname=10 Org - Org Name= 40
First Name= 3 Post Code= 8
Post Code= 8

Many Thanks

Ryan
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 07-31-2008, 08:12 AM   #2 (permalink)
Crystal Reports Guru

DrewAllen's Avatar

Join Date: May 2006
Location: Philadelphia, PA
Posts: 1,168
Rep Power: 4 DrewAllen is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)
- Information Edge (TIE)
- API/VBA

It really depends on how many duplicates you have and how much staff time you have to devote to cleaning it up.

We had three separate databases before we converted to RE. We meticulously went through the two secondary databases and coded each of the records with the ID for the main database. Our conversion team callously disregarded all of that and created separate records anyhow. Needless to say, we had a massive duplicate problem as a direct result of this; everyone was busy trying to get up and running on the new system; and we had very few resources to devote to cleaning up duplicates.

We used very strict duplicate criteria, because we would have been overwhelmed otherwise.

Drew
__________________
J. Drew Allen
Children's Hospital of Philadelphia

Crystal Reports and SQL Server Consultant

It is better to live your destiny imperfectly than to live an imitation of somebody else's life with perfection.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 07-31-2008, 04:11 PM   #3 (permalink)
Frolicking Dirt Child

Allison L.'s Avatar

Join Date: Apr 2007
Location: Austin
Posts: 31
Rep Power: 0 Allison L. is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

Quote:
We meticulously went through the two secondary databases and coded each of the records with the ID for the main database. Our conversion team callously disregarded all of that and created separate records anyhow.
__________________
Allison Landry
Donations Coordinator
Austin Humane Society
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 07-31-2008, 05:56 PM   #4 (permalink)
Member

sagespot's Avatar

Join Date: Nov 2007
Location: Des Moines, IA
Posts: 34
Rep Power: 0 sagespot is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

Allison took the icons right out of my mouth in regards to Drew's situation.
__________________
Sage
Advancement Database and Records Manager
Grand View University
www.grandview.edu
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-01-2008, 08:22 AM   #5 (permalink)
annieb

andrea.bbbs's Avatar

Join Date: Dec 2006
Location: Venice, Florida
Posts: 314
Rep Power: 2 andrea.bbbs is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

I have been going through a MAJOR house cleaning this summer - starting with a few weeks of de-duping that went way beyond running the de-dupe report. I printed out Melissa's fixit suggestions and started those too. Allison's icons don't even begin to describe this process.

I just have to keep telling myself how lucky I am to have a job....
__________________
Andrea Shlasko
Database Manager
Big Brothers Big Sisters of the Sun Coast Inc.

"Just another day in Paradise, where every meal's a banquet and every day's a holiday"
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-01-2008, 10:19 AM   #6 (permalink)
Member

Join Date: Nov 2007
Location: Medford, Oregon area
Posts: 50
Rep Power: 2 LCox is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

Ryan says:
"At the moment I am searching on:
Ind - Surname=10 Org - Org Name= 40
First Name= 3 Post Code= 8
Post Code= 8"


---> Please pardon my ignorance, but I don't understand this. I'm somewhat of a geek, but at databases I'm pretty much a newbie, as they say. Nonetheless, I'd like to know more about de-duping our RE database - and while it seems there are clues in this thread, it's unfortunately as good as Greek to me.

Also, from my perspective, I don't see a real answer to the question Ryan put. Of course it's all relative - you do as much as you can find time or personnel for, but - some examples would still be helpful?

Is there someone who doesn't mind translating the above for me into something for more of a beginner to understand?
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-01-2008, 10:37 AM   #7 (permalink)
Courtauld Institute of Art

wolfganghario's Avatar

Join Date: Jan 2007
Location: London UK
Posts: 36
Rep Power: 0 wolfganghario is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

Ryan is referring to the criteria used in Business Rules for running a duplicates report. You can change the criteria to check on however much or little information you deem necessary. So for example 'Ind - Surname=10 First Name= 3 Post Code= 8' means that RE will go through all your constituents and anyone who has the same 10 letters in their surname AND the same 3 letters in their first name AND the same 10 digits in their postcode/zipcode will all come out in a report for you to then go through and determine which (if any) you want to merge and delete.

You have to experiment a little with the criteria, but it all depends on how well maintained your database is. For example including postcode in your criteria won't match people up if one of them has no postcode, and J Smith won't get matched up with John Smith if you have 3 digits in the first name search criteria...etc

Hope this has helped?

Lee
__________________
Lee Harrison

Database Coodinator
Courtauld Institute of Art
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-01-2008, 10:46 AM   #8 (permalink)
Member

Join Date: Nov 2007
Location: Medford, Oregon area
Posts: 50
Rep Power: 2 LCox is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

Ah, thank you, yes, very much! Occasionally I feel very lost in jargon in these forums - I know there are a lot of really knowledgeable folks here to learn from, but I can't track the conversations sometimes, and that's a novel situation for me since I'm not exactly a dummy myself.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-01-2008, 10:53 AM   #9 (permalink)
Junior Member

Join Date: Feb 2007
Posts: 1
Rep Power: 0 Irene Minczer is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

Dupes

Hi Ryan,

I would leave your surname, address, postcode criteria as they are. I would change the first name or initials field to 1, as there is too high a likelihood of having the same person listed as eg "J Smith", "J H Smith" and "John Smith" and if you have more than 1 character in your first name field, the report will not show these 3 records as dupes.

Basically my method of running dupe reports is to start with a "perfect" duping, which is which you're currently running. Then when you've got rid of the majority of them, start to knock your numbers down so the report starts to pick up on less than perfect dupes, as with Mr Smith above.

So today you would run:
Surname 10, First name 1, Address 40, Org Name 40

then when you've cleared that list, or got it down to 100 or so, run:

Surname 5, First name 0, address 10, Org name 10

so you're changing the criteria to first pick up the obvious dupes, get all those merged, and then running it to pick up the ones which may or may not be dupes and will need individually checking before merging.

Hope that makes sense, Irene Minczer, YMCA
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-01-2008, 10:53 AM   #10 (permalink)
Crystal Reports Guru

DrewAllen's Avatar

Join Date: May 2006
Location: Philadelphia, PA
Posts: 1,168
Rep Power: 4 DrewAllen is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)
- Information Edge (TIE)
- API/VBA

Quote:
Originally Posted by wolfganghario View Post
You have to experiment a little with the criteria, but it all depends on how well maintained your database is. For example including postcode in your criteria won't match people up if one of them has no postcode, and J Smith won't get matched up with John Smith if you have 3 digits in the first name search criteria...etc
Actually, neither of these statements is correct. We had several times where a constituent with a Zip Code matched a constituent without a Zip Code. We use five characters of the Zip Code.

We also had one case where "Hart" matched "Hartwell" even though we were using ten characters for the last name.

This occurs, because the matches are programmed incorrectly. It should be taking the first n characters of the field on each record and seeing if they are equal. Instead, it is seeing if the field on one record is LIKE the first n characters of the field on the other record. Note that this is not commutative!

For example, consider taking the first five characters.

'John' LIKE 'J%' evaluates to true. '%' is the SQL wildcard for any number of characters.
'J' LIKE 'John%' evaluates to false.

In our case, we were getting a match, because
'Hartwell' LIKE 'Hart%' evaluates to true.

Drew
__________________
J. Drew Allen
Children's Hospital of Philadelphia

Crystal Reports and SQL Server Consultant

It is better to live your destiny imperfectly than to live an imitation of somebody else's life with perfection.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-01-2008, 11:13 AM   #11 (permalink)
Courtauld Institute of Art

wolfganghario's Avatar

Join Date: Jan 2007
Location: London UK
Posts: 36
Rep Power: 0 wolfganghario is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

In that case I stand corrected. Thanks Drew.
Won't the duplicate query always look at 'J' LIKE 'John' rather than 'John' LIKE 'J' though, or does it not run alphabetically?

Lee
__________________
Lee Harrison

Database Coodinator
Courtauld Institute of Art
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-01-2008, 11:25 AM   #12 (permalink)
Crystal Reports Guru

DrewAllen's Avatar

Join Date: May 2006
Location: Philadelphia, PA
Posts: 1,168
Rep Power: 4 DrewAllen is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)
- Information Edge (TIE)
- API/VBA

Quote:
Originally Posted by wolfganghario View Post
Won't the duplicate query always look at 'J' LIKE 'John' rather than 'John' LIKE 'J' though, or does it not run alphabetically?
I haven't investigated in that much detail. If the initial is entered with a period '.' then the two strings won't match for lengths greater than one. I suspect that in many cases this is why they don't match despite the flaws in the programming.

Drew
__________________
J. Drew Allen
Children's Hospital of Philadelphia

Crystal Reports and SQL Server Consultant

It is better to live your destiny imperfectly than to live an imitation of somebody else's life with perfection.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-05-2008, 07:18 AM   #13 (permalink)
Member

Simon Stacey's Avatar

Join Date: Jun 2008
Location: London, UK
Posts: 67
Rep Power: 1 Simon Stacey is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

Quote:
Originally Posted by DrewAllen View Post
This occurs, because the matches are programmed incorrectly. It should be taking the first n characters of the field on each record and seeing if they are equal. Instead, it is seeing if the field on one record is LIKE the first n characters of the field on the other record. Note that this is not commutative!
----8<----
Drew
Ouch. [sigh]

A "sounds-like" search option (using generally accepted homonym logic) might also be useful, given errors I've seen created by telephone services who don't double-check spellings ...
__________________
Simon Stacey
Database Officer
Centrepoint
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-20-2008, 09:23 AM   #14 (permalink)
Woman of Many Hats

vsiegel's Avatar

Join Date: Jan 2007
Location: Chicago
Posts: 56
Rep Power: 2 vsiegel is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

I'd just like to note for posterity here that some of us LOVE doing this kind of clean-up, and we're out there, in your city, waiting to do a freelance cleanup of the database that has you hitting your head against a wall.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-20-2008, 09:32 AM   #15 (permalink)
Member

Simon Stacey's Avatar

Join Date: Jun 2008
Location: London, UK
Posts: 67
Rep Power: 1 Simon Stacey is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

Quote:
Originally Posted by vsiegel View Post
I'd just like to note for posterity here that some of us LOVE doing this kind of clean-up
You're not the only one - although I'd really like to have enough time to do that, in between the regular donation processing and appeal package segmentation that takes up most of my time! I'm making a little list of cleanups that need to be done at some point ...
__________________
Simon Stacey
Database Officer
Centrepoint
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-21-2008, 12:05 PM   #16 (permalink)
Member

Join Date: Nov 2007
Location: Medford, Oregon area
Posts: 50
Rep Power: 2 LCox is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

Oh, for a volunteer like that in the Medford, Oregon area... like Simon, I enjoy that type of work myself although I'm relatively new at it. Also like Simon, I've been forced to spend my time on other things, and to relegate this kind of cleanup to lists of things I'd like to see done on a regular basis.

I've also toyed with the idea of trying to get an intern in to help with things like this, among other tasks.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-22-2008, 03:52 AM   #17 (permalink)
Member

Simon Stacey's Avatar

Join Date: Jun 2008
Location: London, UK
Posts: 67
Rep Power: 1 Simon Stacey is on a distinguished road

Blackbaud Products
- Raiser's Edge (RE)

Quote:
Originally Posted by LCox View Post
forced to ... relegate this kind of cleanup to lists of things I'd like to see done on a regular basis.
I think that everyone who works in the charity/nfp data sector has a wishlist like that - "All the things we'd do if we had time or resources to do this properly." Rather than the more common "Aaaaaaaaaaaaaaaaagh! Got to get that done by lunchtime!" which we end up having to do.
__________________
Simon Stacey
Database Officer
Centrepoint
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!Spurl this Post!Reddit! Wong this Post!

Old 08-22-2008, 09:45 AM   #18 (permalink)
annieb