This repository is a fork of a wider project called matchID.
matchID aims to help developers (and organizations) match people's identities: a kind of deduplication library focused mainly on people.
It has two main applications:
- link a database containing information about people to another database that also contains information about people
- remove duplicate people from a database
matchID is developed by @rhanka and @SuperKiwi, who is part of the 2017 "Entrepreneur d'Intérêt Général" program, a French presidential program that brings tech people to work for an administration for a 10-month period, inspired by Obama's Presidential Innovation Fellows (PIF).
To install matchID, follow the guidelines of matchID-backend and matchID-frontend.
The goal of this repository (the validation sub-project) is to provide a user-friendly interface to display matching results.
It allows users to easily visualize matches between people.
The main technologies used are Vue.js and Elasticsearch.
A running elasticsearch instance is required for this application to work.
- a recent version of node
- a recent version of npm or yarn
- a recent version (> 5.x) of elasticsearch
To install and configure elasticsearch on your server or localhost, follow the official guidelines.
Note: we are not sure of the minimum required version of elasticsearch. We used 5.x versions to develop matchID.
git clone https://github.com/eig-2017/matchID-validation.git
cd matchID-validation
yarn install  # or: npm install
cp -r matchIdConfig.example matchIdConfig
Once your configuration has been set, run yarn run dev (or npm run dev).
There are four different parts in the navbar:
There are also four parts in the controller :
- Filter through results from Elasticsearch with text filtering
- Filter through results from Elasticsearch depending on whether a match has already been processed or not
Please note that all data displayed here is dummy randomized data
The data table lists all the matches found by the matchID backend. Except for the Results and Status columns, all columns can be customized.
- The Results column contains pre-computed results according to the score (displayed in the previous column). In this example:
  - if the score is above 55, the first checkbox (validation_decision) will be set to true
  - if the score is between 40 and 65, the question mark checkbox (validation_indecision) will be set to true
- The Status column is set to true once you consider that the results (validation_decision and validation_indecision) are correct
Your data mapping should look like this (names and types are fully customizable) :
{
  "properties": {
    "hashed_hexadecimal": {
      "type": "text",
      "store": true
    },
    "last_name_1": {
      "type": "text",
      "store": true
    },
    "last_name_2": {
      "type": "text",
      "store": true
    },
    "first_name_1": {
      "type": "text",
      "store": true
    },
    "first_name_2": {
      "type": "text",
      "store": true
    },
    "date_of_birth_1": {
      "type": "text",
      "store": true
    },
    "date_of_birth_2": {
      "type": "text",
      "store": true
    },
    "...": {
      "...": "..."
    },
    "...": {
      "...": "..."
    },
    "distance_cities_of_birth": {
      "type": "long",
      "store": true
    },
    "score": {
      "type": "long",
      "store": true
    },
    "...": {
      "...": "..."
    },
    "...": {
      "...": "..."
    },
    "validation_decision": {
      "type": "text",
      "fielddata": true
    },
    "validation_done": {
      "type": "long",
      "store": true
    }
  }
}
A few notes :
- Concerning validation_decision, validation_done and the optional validation_indecision, you can either keep the field as text with "fielddata": true or just set the field as long (more in Validation)
- hashed_hexadecimal will be explained in Random Hash
An example row for the previous setup would be:
hashed_hexadecimal | last_name_1 | last_name_2 | first_name_1 | first_name_2 | date_of_birth_1 | date_of_birth_2 | score | validation_decision | validation_done |
---|---|---|---|---|---|---|---|---|---|
a2f34aeb3d777f82 | Grenier | Grenier | Martin Jorge Robert | Martin Georges | 04-03-1989 | 03-04-1989 | 68 | null | null |
If we keep the above example, your columns.json should look like this :
[
  {
    "field": "hashed_hexadecimal",
    "label": "Hashed Id",
    "display": false,
    "searchable": true
  },
  {
    "field": ["last_name_1", "last_name_2"],
    "label": "Last Name",
    "display": true,
    "searchable": true,
    "callBack": "coloredDiff"
  },
  {
    "field": ["first_name_1", "first_name_2"],
    "label": "First Name",
    "display": true,
    "searchable": true,
    "callBack": "coloredDiff"
  },
  {
    "field": ["date_of_birth_1", "date_of_birth_2"],
    "label": "Date of Birth",
    "display": true,
    "searchable": true,
    "callBack": "formatDate",
    "appliedClass": {
      "head": "head-centered",
      "body": "has-text-centered"
    }
  },
  {
    "field": "distance_cities_of_birth",
    "label": "Distance",
    "display": true,
    "searchable": false,
    "callBack": "formatDistance",
    "appliedClass": {
      "head": "head-centered",
      "body": "has-text-centered"
    }
  },
  {
    "field": "score",
    "label": "Score",
    "display": true,
    "searchable": true,
    "type": "score",
    "appliedClass": {
      "head": "head-centered",
      "body": "has-text-centered min-column-width-100"
    }
  }
]
Mandatory sub-fields: field, label, display, searchable.
Some rules :
- every displayed column (that is, one that appears in the data table) can be searchable or not
- you can optionally add a callBack function to every displayed column (see custom functions)
- you can optionally add classes that will be applied to the head cell or the body cells; you define your custom classes in custom.scss
- every numeric scale that you want to display as a progress bar needs the following attribute: "type": "score"
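The mandatory sub-fields can be checked before starting the application. The snippet below is only a sketch: missingSubFields is a hypothetical helper, not part of matchID-validation, and it assumes an entry of columns.json has already been parsed into a plain object.

```javascript
// Sub-fields that the text above lists as mandatory for every column entry.
const REQUIRED_SUB_FIELDS = ['field', 'label', 'display', 'searchable'];

// Hypothetical helper: returns the mandatory keys missing from one entry.
function missingSubFields(column) {
  return REQUIRED_SUB_FIELDS.filter((key) => !(key in column));
}

// Example entry taken from the columns.json sample, minus "searchable".
const incomplete = { field: 'score', label: 'Score', display: true };
console.log(missingSubFields(incomplete));
```

An empty array means the entry declares all four mandatory sub-fields.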
You can (and should) define custom functions in formatCell.js. These functions are used as callbacks (as defined above in Columns) for the cells of the data table.
Do not forget to declare your functions inside the export default declaration.
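As a rough illustration of what such callbacks might look like, here is a minimal sketch. Everything in it is an assumption: the function bodies, the kilometre unit, the DD-MM-YYYY input format, and the idea that a callback receives the raw cell value and returns a string. Check formatCell.js for the real signatures. In the actual file these functions would sit inside the export default declaration rather than in a plain object.

```javascript
// Hypothetical callbacks; names match the "callBack" values used in columns.json,
// but the bodies are illustrative only.
const formatCell = {
  // Assumes the distance is a number expressed in kilometres.
  formatDistance(value) {
    if (value === null || value === undefined) return '';
    return `${Math.round(value)} km`;
  },

  // Assumes dates arrive as DD-MM-YYYY strings and only swaps the separator.
  formatDate(value) {
    if (typeof value !== 'string') return '';
    return value.split('-').join('/');
  }
};

console.log(formatCell.formatDistance(12.6));
console.log(formatCell.formatDate('04-03-1989'));
```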
You can add your own styles to columns or cells by declaring your classes inside custom.scss.
Then apply them through appliedClass in the columns.json config file.
Bulma is the CSS framework used.
You can decide to remove validation by setting "display": false in validation.json.
{
  "display": true,
  "action": {
    "label": "Results",
    "indecision_display": true
  },
  "done": {
    "label": "Status"
  }
}
If you enable validation, you need to define labels for both the action and done fields.
If you set "indecision_display" to true, a question-mark checkbox will let you flag indecision about the match.
Let's check what's inside scores.json :
{
  "column": "score",
  "range": [0, 100],
  "colors": {
    "success": 80,
    "info": 60,
    "warning": 30,
    "danger": 0
  },
  "statisticsInterval": 10,
  "preComputed": {
    "decision": 55,
    "indecision": [40, 65]
  }
}
In this example, scores range from 0 to 100.
We set up a color for each range of scores (in this example, the progress bar will use the success color if the score is above 80, info if the score is between 60 and 80, and so on).
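The threshold lookup described above can be sketched as follows. colorForScore is a hypothetical helper, not the function used by matchID-validation, and it assumes each threshold is an inclusive lower bound, which the configuration itself does not specify.

```javascript
// "colors" block from the scores.json example.
const colors = { success: 80, info: 60, warning: 30, danger: 0 };

// Hypothetical helper: returns the name of the highest threshold
// that the score reaches, or null if it is below every threshold.
function colorForScore(score, thresholds) {
  const ordered = Object.entries(thresholds).sort((a, b) => b[1] - a[1]);
  for (const [name, minimum] of ordered) {
    if (score >= minimum) return name; // assumption: lower bound is inclusive
  }
  return null;
}

console.log(colorForScore(68, colors));
```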
The preComputed field allows matchID-validation to prefill the decision and indecision columns according to the score. In this example, if the score is above 55, validation_decision will be set to true; and if the score is between 40 and 65, validation_indecision will be set to true.
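A minimal sketch of this prefill rule is shown below. prefillValidation is a hypothetical name, and treating "above" as strictly greater and "between" as inclusive on both ends is an assumption; the configuration does not say how boundary scores are handled.

```javascript
// "preComputed" block from the scores.json example.
const preComputed = { decision: 55, indecision: [40, 65] };

// Hypothetical helper: derives both validation flags from a score.
function prefillValidation(score, config) {
  const [low, high] = config.indecision;
  return {
    validation_decision: score > config.decision,           // assumption: strict
    validation_indecision: score >= low && score <= high    // assumption: inclusive
  };
}

// The example row above has a score of 68.
console.log(prefillValidation(68, preComputed));
```

Note that the two ranges overlap: with these values, a score of 60 sets both flags to true.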
Let's check what's inside view.json :
{
  "display": true,
  "column_name": "View",
  "fields": {
    "operation": "excluded",
    "names": ["distance_cities_of_birth"]
  }
}
If you set display to true, a column with a call-to-action is added on the left of your table. Clicking it opens a pop-up with all the information about the row.
Good to know:
- you can set the operation field to excluded or included
- all columns listed in the names field will be excluded from (or included in) the JSON view (of course, names can be an empty array [])
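The excluded/included behaviour can be illustrated with a small filter. fieldsForJsonView is a hypothetical helper and the row values are made up for the example; it also assumes that an empty names array with included yields an empty view, which the text above does not state.

```javascript
// Hypothetical helper: keeps or drops the listed columns of a row,
// depending on the "operation" value from view.json.
function fieldsForJsonView(row, fields) {
  const listed = new Set(fields.names);
  const keep = (key) =>
    fields.operation === 'included' ? listed.has(key) : !listed.has(key);
  return Object.fromEntries(Object.entries(row).filter(([key]) => keep(key)));
}

// Made-up row with two of the columns from the example mapping.
const row = { score: 68, distance_cities_of_birth: 12 };
const viewFields = { operation: 'excluded', names: ['distance_cities_of_birth'] };
console.log(fieldsForJsonView(row, viewFields));
```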
Let's check what's inside randomId.json :
{
  "characters": "abcdef0123456789",
  "length": 2,
  "prefix": "*",
  "suffix": "*",
  "default_search_field": "hashed_hexadecimal"
}
The default_search_field is the default field used to query elasticsearch.
On this field, we search for a random word of length 2 (composed of the characters listed in characters) with * as prefix and suffix. Example: *a4*
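The random search term can be sketched as below. randomWildcard and buildSearchBody are hypothetical helpers, and sending the term as an Elasticsearch wildcard query is an assumption about how matchID-validation queries the index; the actual query type may differ.

```javascript
// Configuration from the randomId.json example.
const randomId = {
  characters: 'abcdef0123456789',
  length: 2,
  prefix: '*',
  suffix: '*',
  default_search_field: 'hashed_hexadecimal'
};

// Hypothetical helper: builds a term such as "*a4*".
// The random source is injectable so the function can be tested.
function randomWildcard(config, random = Math.random) {
  let word = '';
  for (let i = 0; i < config.length; i++) {
    const index = Math.floor(random() * config.characters.length);
    word += config.characters[index];
  }
  return `${config.prefix}${word}${config.suffix}`;
}

// Assumption: the term is sent as a wildcard query on the default search field.
function buildSearchBody(config, random = Math.random) {
  return {
    query: {
      wildcard: { [config.default_search_field]: randomWildcard(config, random) }
    }
  };
}

console.log(JSON.stringify(buildSearchBody(randomId)));
```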
You can add several languages by customizing the lang.json file.
Ctrl + Alt will enable/disable the shortcuts.
Once enabled, the following ones are available:
- a will change the pre-computed decision (validation_decision)
- e will change the pre-computed indecision (validation_indecision)
- i will randomly reload the data
- d will show/hide a JSON view of the row's data
- Arrow UP and Arrow DOWN allow you to move between rows