Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimise: stores char data as bytes instead of their code representation #107

Merged
merged 1 commit into from
Oct 12, 2024

Conversation

Samy-33
Copy link
Contributor

@Samy-33 Samy-33 commented Oct 12, 2024

No description provided.

Comment on lines 159 to 160
// First two lex tokens are invalid characters i.e. \ne and \apple
for(int _ignored = 1; _ignored <= 2; _ignored++)
Copy link
Member

@jeaye jeaye Oct 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two stylistic points.

  1. We always use /* */ comments in jank. We also use correct grammar and punctuation, since comments are documentation. So we'll want a period at the end of this.
  2. This for loop is more idiomatically written like this:
for(size_t i{}; i < 2; ++i)

In particular, a few things change:

  1. We use size_t because we're counting up from 0
  2. We use direct initialization, as we do in modern C++
  3. We zero-base our indices/loop counters
  4. We name the counter i, even if it's ignored
  5. We use ++i instead of i++; while there's no practical difference here, the postfix increment has serious performance implications in other scenarios and we want to only use it when we actually want postfix; here, we actually want prefix increment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you point me to a resource for why i++ would have perf implications?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ++ operator in C++ is overloadable. We do this regularly for custom iterator types. For more complex iterator types, copying the iterator may be very slow. i++ requires copying i so it can be returned and then incrementing the actual i. So, imagine an iterator which holds a vector and an index in it. By doing i++, you need to copy the iterator, which means deep copying the entire vector. If you use ++i, the iterator is updated in place.

As I said, in this situation, the compiler will treat i++ to mean ++i because clang knows that people like to write i++ when they mean ++i. But C++ is all about building good habits so that you don't get bitten. By using ++i as a default, and only i++ when you actually want a copy of the value before it was incremented, you won't be surprised down the road.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Samy-33 Samy-33 force-pushed the perf/optimize-char-runtime-data branch from 47b3350 to ca45c72 Compare October 12, 2024 19:44
@Samy-33 Samy-33 requested a review from jeaye October 12, 2024 19:47
@jeaye jeaye merged commit a45f306 into jank-lang:main Oct 12, 2024
2 of 4 checks passed
@jeaye
Copy link
Member

jeaye commented Oct 12, 2024

Nice work, Saket!

@Samy-33 Samy-33 deleted the perf/optimize-char-runtime-data branch October 12, 2024 20:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants