Add an absurdly long sentence
linas committed Feb 26, 2023
1 parent 1e7da3f commit 08a6069
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion data/en/corpus-fix-long.batch
@@ -4,7 +4,7 @@
!short=20
!constituents=1
!spell=0
-!timeout=1000
+!timeout=600

% Example sentences showing BUG FIXES since link-grammar 4.1b
% as well as demonstating bugs that remain to be fixed.
@@ -65,3 +65,9 @@ In Buddhism there are 84 Mahasiddhas: Acinta, Ajogi, Anangapa, Aryadeva, Babhaha

% 2/2;
And yet he should be always ready to have a perfectly terrible scene, whenever we want one, and to become miserable, absolutely miserable, at a moment’s notice, and to overwhelm us with just reproaches in less than twenty minutes, and to be positively violent at the end of half an hour, and to leave us for ever at a quarter to eight, when we have to go and dress for dinner when, after that, one has seen him for really the last time, and he has refused to take back the little things he has given one, and promised never to communicate with one again, or to write one any foolish letters, he should be perfectly broken-hearted, and telegraph to one all day long, and send one little notes every half-hour by a private hansom, and dine quite alone at the club, so that every one should know how unhappy he was.

% --------------------------------------------------------------------
% Long sentence w/208 words; more-or-less guaranteed to timeout.

ampli Mar 2, 2023

Member

You wrote here "more-or-less guaranteed to timeout".
However, it doesn't time out in regular batch mode (which passes with 0 nulls).
It actually takes ~0.6 seconds on my computer.

(It does time out in batch mode when nulls are allowed, or in interactive mode. But there is nothing special about that, since most long sentences that are not fully parsable currently need significant time to parse with nulls.)

linas Mar 2, 2023

Author Member

Huh. I set timeout=3600111 and it ran overnight, with no results. I concluded that the combinatorial explosion of parsing with ANY links is more or less the same as the combinatorial explosion of parsing with nulls.

ampli Mar 2, 2023

Member

Note that this is for parsing with nulls, while a standard batch doesn't parse with nulls.
(For this specific sentence, when running in standard batch mode, no attempt is made to parse it, since power_prune() finds a null word and the parse with 0 nulls is skipped.)

I started running it interactively with -verbosity=2 -time=1000000, and so far it finished parsing with 1 null in ~60 seconds and with 2 nulls in ~230 seconds. For each additional null count, it has to redo all the work of the previous null counts. (I once implemented incremental parsing, but deleted that code because it was subtle, and in any case I could not prove that pp_prune() cannot mess things up in certain cases.)

> I concluded that the combinatorial explosion of parsing with ANY links is more or less the same as the combinatorial explosion of parsing with nulls.

I think this is true only if the minimum number of nulls for parsing the sentence is very big. If it is 1 or 2 the parse time is relatively short.

linas Mar 3, 2023

Author Member

Perhaps I should remove this sentence? I admit, I have not measured how long the others take, if the timeout is made unlimited.

ampli Mar 3, 2023

Member

Linkages are found with 3 nulls after an additional ~120 minutes, but even with limit=1000000 there is no valid linkage (all are P.P. violations).

++++ Finished expression pruning                 0.01 seconds
++++ Built disjuncts                             0.08 seconds
++++ Eliminated duplicate disjuncts              0.04 seconds
++++ Encoded for pruning                         0.08 seconds
++++ power pruned (for 0 nulls)                  0.09 seconds
++++ Built mlink_table                           0.00 seconds
++++ power pruned (for 0 nulls)                  0.03 seconds
++++ pp pruning                                  0.00 seconds
++++ power pruned (for 0 nulls)                  0.01 seconds
++++ Built mlink_table                           0.00 seconds
++++ power pruned (for 0 nulls, extra null)      0.01 seconds
#### Skip parsing (w/0 nulls)
++++ Finished parse                              0.00 seconds
No complete linkages found.
++++ Finished expression pruning                 0.00 seconds
++++ Built disjuncts                             0.08 seconds
++++ Eliminated duplicate disjuncts              0.04 seconds
++++ Encoded for pruning (one-step)              0.11 seconds
++++ power pruned (for 1 null)                   0.10 seconds
++++ Built mlink_table                           0.00 seconds
++++ power pruned (for 1 null)                   0.03 seconds
++++ pp pruning                                  0.00 seconds
++++ power pruned (for 1 null)                   0.01 seconds
++++ Built mlink_table                           0.00 seconds
++++ power pruned (for 1 null)                   0.02 seconds
++++ Encoded for parsing                         0.01 seconds
++++ Initialized fast matcher                    0.00 seconds
++++ Counted parses (0 w/1 null)                54.05 seconds
++++ power pruned (for 2 nulls)                  0.14 seconds
++++ Built mlink_table                           0.00 seconds
++++ power pruned (for 2 nulls)                  0.02 seconds
++++ pp pruning                                  0.00 seconds
++++ power pruned (for 2 nulls)                  0.02 seconds
++++ Built mlink_table                           0.00 seconds
++++ power pruned (for 2 nulls)                  0.02 seconds
++++ Encoded for parsing                         0.01 seconds
++++ Initialized fast matcher                    0.00 seconds
++++ Counted parses (0 w/2 nulls)              237.00 seconds
++++ power pruned (for 3 nulls)                  0.11 seconds
++++ Built mlink_table                           0.00 seconds
++++ power pruned (for 3 nulls)                  0.02 seconds
++++ pp pruning                                  0.00 seconds
++++ power pruned (for 3 nulls)                  0.03 seconds
++++ Built mlink_table                           0.00 seconds
++++ power pruned (for 3 nulls)                  0.02 seconds
++++ Encoded for parsing                         0.01 seconds
++++ Initialized fast matcher                    0.01 seconds
++++ Counted parses (2147483647 w/3 nulls)     633.96 seconds
++++ Built parse set                            10.97 seconds
link-grammar: Warning: Count overflow.
Considering a random subset of 1000000 of an unknown and large number of linkages
	Failing sentence contains the following words/morphemes:
	LEFT-WALL there will be other ships I am sure , was half asleep when wrote this can you tell Mentions of Sex Cussing channeling your inner daichi is a thing trust me my former volleyball captain du did idu it she terrifying captains are either really cool or utterly no middle ground Asahi works at Starbucks bout and mai siblings which probably only makes sense to need knowledge ghost hunt whatsoever read the mau main in.u focus nerds spider in chapter one some mentioned but everyone gets screen time especially senorita Team Captain Ennoshita Chikara team shabby shigeru akaashi keiji mammoth retake cause last would hilarious TB asahi strong coffee not allowed drink murderous liberos alcohol mention underage drinking can't cook Fake tsukki/kags dating inspired by beautiful best friends who fucked up gotta love aren't fucking for change 's like Irish Alzheimer forget everything grudges regular irregular update updates yam gay day lay ray say ya ! Was edited so if want re-read little more coherent RIGHT-WALL 
++++ Postprocessed all linkages                 78.85 seconds
++++ power pruned (for 4 nulls)                  0.13 seconds
++++ Built mlink_table                           0.00 seconds
++++ power pruned (for 4 nulls)                  0.03 seconds
++++ pp pruning                                  0.00 seconds
++++ Encoded for parsing                         0.01 seconds
++++ Initialized fast matcher                    0.01 seconds
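
To see at a glance how the reported time grows with the null count, the timing lines above can be totaled per null count. The following is a minimal reader's sketch in Python, not part of link-grammar: the function name `seconds_per_null_count` and both regular expressions are assumptions based only on the line format shown in this log.

```python
import re
from collections import defaultdict

# Assumed format, taken from the log above, e.g.:
#   "++++ Counted parses (0 w/1 null)                54.05 seconds"
TIMING_LINE = re.compile(
    r"^\+\+\+\+ (?P<stage>.+?)\s+(?P<secs>\d+(?:\.\d+)?) seconds$"
)
# Extracts the null count from stage names such as
# "power pruned (for 2 nulls)" or "Counted parses (0 w/2 nulls)".
NULL_COUNT = re.compile(r"(?:for |w/)(\d+) nulls?")


def seconds_per_null_count(log_text):
    """Sum the reported seconds of every stage that names a null count.

    Hypothetical helper: stages that do not mention a null count
    (for example "Built mlink_table") are ignored.
    """
    totals = defaultdict(float)
    for line in log_text.splitlines():
        match = TIMING_LINE.match(line.strip())
        if match is None:
            continue  # not a "++++ ... seconds" timing line
        nulls = NULL_COUNT.search(match.group("stage"))
        if nulls is not None:
            totals[int(nulls.group(1))] += float(match.group("secs"))
    return dict(totals)
```

Because stages without an explicit null count are skipped, the totals are lower bounds on the time spent at each null count.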

linas Mar 3, 2023

Author Member

Yes, that's similar to what I saw ... or, presumably identical. This was one of the motivators to put that one into the test case, even though it is not anywhere near as fun to read as the other sentences. It dumbed down the literary style. Alas.

% Note MAX_SENTENCE is 254

There will be other ships I am sure, I was half asleep when I wrote this can you tell, Mentions of Sex, Cussing, channeling your inner daichi is a thing, trust me my former volleyball captain did it she was terrifying, volleyball captains are either really cool or utterly terrifying there is no middle ground, Asahi works at Starbucks, bokuto and mai are siblings which probably only makes sense to me, you need no knowledge of ghost hunt whatsoever to read this, the main focus is the volleyball nerds, there is a spider in chapter one, some ships are only mentioned but everyone gets screen time, especially ennoshita, Team Captain Ennoshita Chikara, team captain yahaba shigeru, team captain akaashi keiji, team captain yamamoto taketora, cause the last one would be hilarious tbh, asahi makes really really strong coffee and bokuto is not allowed to drink it, murderous liberos, alcohol mention, underage drinking mention, daichi can't cook, Fake tsukki/kags dating inspired by my beautiful best friends who really fucked up, gotta love it when you aren't the one fucking up for a change, it's like Irish Alzheimer's, you forget everything but the grudges, ~irregular updates~, yay!, Was edited so if you want to re-read it's a little more coherent

1 comment on commit 08a6069

ampli commented on 08a6069 Mar 3, 2023

Member

> Perhaps I should remove this sentence? I admit, I have not measured how long the others take, if the timeout is made unlimited.

It is an interesting sentence because all its linkages (w/3 nulls) seem to have P.P. violations (but I have not tried limit>1000000).
