project1 multi
This page documents Project1: Signal::multi.
There are three versions of my implementation.
The main idea is the Aho-Corasick algorithm, but my implementation is a little different; see below. Queries are processed on multiple threads using a new thread pool.
This is duplicated in the Project1::master wiki.
The goal is to find which of many patterns occur in a query. The KMP algorithm finds a single pattern in a query, but for a set of patterns it takes `O(m + zn)` (`m` is the total length of the patterns, `z` is the number of patterns and `n` is the size of the query). Aho-Corasick solves this problem in `O(m + n + k)` (`k` is the number of pattern occurrences in the query).
There are three major parts in the Aho-Corasick algorithm.
- Construct a trie for the patterns.
  - Trace the trie nodes and create a node wherever there is no match.
- Make the `Failure Link` and the `Output Link`.
  - The `Failure Link` prevents restarting from the beginning after a mismatch.
  - The `Output Link` ensures that matching works correctly even when one pattern is a substring of another.
- Find matches.
  - Follow the trie nodes; a pattern is found whenever there is an output link.
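The three parts above can be sketched as follows. This is a minimal textbook-style version, not the code of this project: the names `AhoCorasick`, `add`, `build` and `find` are hypothetical, patterns are assumed to be non-empty, distinct and lowercase `a-z`, and any other character in the query simply resets the search to the root.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical illustration of the classic algorithm (not the project's classes).
struct AhoCorasick {
    struct Node {
        std::array<int, 26> next;  // child per letter, -1 if absent
        int fail = 0;              // failure link: longest proper suffix that is also a trie path
        int output = -1;           // output link: nearest node on the failure chain that ends a pattern
        int pattern = -1;          // index of the pattern ending here, -1 if none
        Node() { next.fill(-1); }
    };

    std::vector<Node> nodes;            // nodes[0] is the root
    std::vector<std::string> patterns;  // inserted patterns, by index

    AhoCorasick() : nodes(1) {}

    // Part 1: trace the trie and create a node wherever no edge exists.
    void add(const std::string& p) {
        int cur = 0;
        for (char ch : p) {
            assert(ch >= 'a' && ch <= 'z');  // assumption: lowercase patterns only
            int c = ch - 'a';
            if (nodes[cur].next[c] == -1) {
                nodes[cur].next[c] = static_cast<int>(nodes.size());
                nodes.emplace_back();
            }
            cur = nodes[cur].next[c];
        }
        nodes[cur].pattern = static_cast<int>(patterns.size());
        patterns.push_back(p);
    }

    // Part 2: breadth-first pass that fills in failure and output links.
    void build() {
        std::queue<int> q;
        for (int c = 0; c < 26; ++c)
            if (nodes[0].next[c] != -1) q.push(nodes[0].next[c]);  // depth 1: fail stays 0
        while (!q.empty()) {
            int u = q.front();
            q.pop();
            for (int c = 0; c < 26; ++c) {
                int v = nodes[u].next[c];
                if (v == -1) continue;
                int f = nodes[u].fail;
                while (f != 0 && nodes[f].next[c] == -1) f = nodes[f].fail;
                if (nodes[f].next[c] != -1) f = nodes[f].next[c];
                nodes[v].fail = f;
                nodes[v].output = (nodes[f].pattern != -1) ? f : nodes[f].output;
                q.push(v);
            }
        }
    }

    // Part 3: scan the query once; returns (index of last matched char, pattern index).
    std::vector<std::pair<std::size_t, int>> find(const std::string& text) const {
        std::vector<std::pair<std::size_t, int>> hits;
        int cur = 0;
        for (std::size_t i = 0; i < text.size(); ++i) {
            char ch = text[i];
            if (ch < 'a' || ch > 'z') { cur = 0; continue; }
            int c = ch - 'a';
            while (cur != 0 && nodes[cur].next[c] == -1) cur = nodes[cur].fail;
            if (nodes[cur].next[c] != -1) cur = nodes[cur].next[c];
            if (nodes[cur].pattern != -1) hits.emplace_back(i, nodes[cur].pattern);
            for (int o = nodes[cur].output; o != -1; o = nodes[o].output)
                hits.emplace_back(i, nodes[o].pattern);
        }
        return hits;
    }
};
```

With the patterns `he`, `she`, `his`, `hers` and the query `ushers`, this sketch reports `she` and then `he` ending at index 3 (the second one is found through the output link), followed by `hers` ending at index 5.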
Redesign for performance.
Design principles: reusable, readable and meaningful code that uses high-level features (e.g. `auto`, lambdas).
- Redesign into a well-organized structure
  - Aho-Corasick automaton
    - Define `Final`, `Normal` and `Init` states
    - Tracer for tracing the requested query through the map (it's like an awesome `iterator`)
    - Change the `Trie` structure to a 2D array (`maximum state size` × `char size (a-z)`)
    - Operators for solving the assignment
  - Design an efficient add / deletion algorithm
- Multi-thread implementation
  - New thread pool
    - High compatibility with lambdas, supporting the C++14 standard
  - Resolve race conditions
    - Minimize shared memory
    - Implement a thread-safe queue
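As an illustration of the multi-thread items above, here is a minimal sketch of a thread-safe queue and a lambda-friendly pool built on it, written against C++14. `SafeQueue`, `ThreadPool`, `submit` and `close` are hypothetical names chosen for this example; the real pool in this project may be organized differently.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical thread-safe queue: every access to items_ happens under mutex_.
template <typename T>
class SafeQueue {
public:
    void push(T value) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(value));
        }
        ready_.notify_one();
    }

    // Blocks until an item is available or the queue is closed.
    // Returns false only when the queue is closed and empty.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool closed_ = false;
};

// Hypothetical pool: workers pull tasks from the shared queue until it is closed.
class ThreadPool {
public:
    explicit ThreadPool(std::size_t workers) {
        assert(workers > 0);
        for (std::size_t i = 0; i < workers; ++i)
            threads_.emplace_back([this] {
                std::function<void()> task;
                while (tasks_.pop(task)) task();
            });
    }

    // Accepts any void() callable, including lambdas.
    void submit(std::function<void()> task) { tasks_.push(std::move(task)); }

    // Lets the workers drain the remaining tasks, then joins them.
    ~ThreadPool() {
        tasks_.close();
        for (auto& t : threads_) t.join();
    }

private:
    SafeQueue<std::function<void()>> tasks_;  // declared first so it outlives the workers
    std::vector<std::thread> threads_;
};
```

One way to keep shared memory small with such a pool is to give every task its own output slot or queue, so the task queue is the only structure that needs a lock.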
Process
- Make a 2D array for the Aho-Corasick accepting automaton.
  - `0` is the `Init State`, values under `0` are `Final State`s, and values over `0` are `Normal State`s (access like this)
  - From the `Init State`, trace each char of the pattern and make nodes (see Map::_insert)
- Process add / deletion.
- The thread pool distributes tasks to threads.
  - Divide the work by the number of workers in the pool
  - Each worker finds matches for its input query (each worker has a different query)
  - Each worker returns found matches to its own queue
  - Merge all of the queues after the pool is empty
- Find matches.
  - Map::begin returns a new Tracer that accepts the input query.
  - Find the `Final State` by incrementing the Tracer until the `Out State`
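The signed-state table and the Tracer described in the process above might look roughly like the sketch below. It borrows the names `Map`, `Tracer` and `begin` from this page, but the bodies, the extra `start` parameter, `insert`, `step`, `position` and `find_all` are assumptions made for illustration. It also leaves out failure links and deletion, so `find_all` restarts a Tracer at every position of the query and is slower than the real automaton.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: each row is a state, each column a letter a-z.
// A cell holds the state entered on that letter:
//   0  -> no transition (back to Init, treated here as the Out state)
//   >0 -> Normal state stored in row `value`
//   <0 -> Final state stored in row `-value`
class Map {
public:
    using Row = std::array<int, 26>;

    Map() : table_(1, Row{}) {}  // row 0 is the Init state

    void insert(const std::string& pattern) {
        assert(!pattern.empty());
        std::size_t row = 0;
        for (std::size_t i = 0; i < pattern.size(); ++i) {
            assert(pattern[i] >= 'a' && pattern[i] <= 'z');  // assumption: a-z only
            int c = pattern[i] - 'a';
            int next = table_[row][c];
            if (next == 0) {  // no node yet: create one
                next = static_cast<int>(table_.size());
                table_.push_back(Row{});
            }
            if (i + 1 == pattern.size()) next = -std::abs(next);  // mark as Final
            table_[row][c] = next;
            row = static_cast<std::size_t>(std::abs(next));
        }
    }

    // Iterator-like walker over the query, starting at a given position.
    class Tracer {
    public:
        Tracer(const Map& map, const std::string& query, std::size_t start)
            : map_(&map), query_(&query), pos_(start) {}

        // Consumes one character and returns the state entered (0 means Out).
        int step() {
            if (pos_ >= query_->size()) return state_ = 0;
            char ch = (*query_)[pos_++];
            if (ch < 'a' || ch > 'z') return state_ = 0;
            std::size_t row = static_cast<std::size_t>(std::abs(state_));
            return state_ = map_->table_[row][ch - 'a'];
        }

        std::size_t position() const { return pos_; }  // index after the last consumed char

    private:
        const Map* map_;
        const std::string* query_;  // the query must outlive the Tracer
        std::size_t pos_;
        int state_ = 0;
    };

    Tracer begin(const std::string& query, std::size_t start = 0) const {
        return Tracer(*this, query, start);
    }

    // Returns (start index, length) of every pattern occurrence in the query.
    std::vector<std::pair<std::size_t, std::size_t>> find_all(const std::string& query) const {
        std::vector<std::pair<std::size_t, std::size_t>> hits;
        for (std::size_t start = 0; start < query.size(); ++start) {
            Tracer t = begin(query, start);
            for (int s = t.step(); s != 0; s = t.step())
                if (s < 0) hits.emplace_back(start, t.position() - start);
        }
        return hits;
    }

private:
    std::vector<Row> table_;
};
```

Storing the state kind in the sign means a single integer comparison tells the Tracer whether it has reached a match, and no separate flag array is needed. With the patterns `he`, `she`, `his`, `hers` and the query `ushers`, `find_all` returns `(1, 3)`, `(2, 2)` and `(2, 4)`.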
You can leave comments and make merge requests to fix my code. 😄