Clarification for the Indexing workflow #54
Comments
Isn't this covered by |
I guess some existing part of the protocol can be used. Let's take as an example the question "When should the language server start indexing the project?". A possible answer is "The indexing should be triggered when the client sends the But, what if the client needs to send the server some configuration related to the indexing? For example, some clients may want to include/exclude some folders from the indexer. The What should the settings related to the indexing look like? There should be some JSON object defined. Then how should the server report the indexing progress to the client? Via How should the client request a full index rebuild? I can't find an appropriate existing request/notification in the protocol. |
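To make the include/exclude question concrete, here is a minimal sketch of what such a JSON settings object and its server-side use could look like. The `indexing`, `include` and `exclude` keys are hypothetical and not defined by the protocol; the only assumption about the protocol is that the existing `workspace/didChangeConfiguration` notification can carry an arbitrary `settings` object.

```python
from fnmatch import fnmatchcase
from typing import Dict, Iterable, List

# Hypothetical settings payload; the key names are illustrative only.
EXAMPLE_SETTINGS: Dict = {
    "indexing": {
        "include": ["*.php"],
        "exclude": ["vendor/*", "node_modules/*"],
    }
}


def files_to_index(paths: Iterable[str], settings: Dict) -> List[str]:
    """Return the workspace-relative paths the indexer should visit."""
    cfg = settings.get("indexing", {})
    include = cfg.get("include", ["*"])   # default: index everything
    exclude = cfg.get("exclude", [])      # default: exclude nothing
    selected = []
    for path in paths:
        included = any(fnmatchcase(path, pattern) for pattern in include)
        excluded = any(fnmatchcase(path, pattern) for pattern in exclude)
        if included and not excluded:
            selected.append(path)
    return selected


print(files_to_index(["src/a.php", "vendor/lib/b.php", "README.md"], EXAMPLE_SETTINGS))
```

Note that `fnmatchcase` lets `*` match across `/`, so `vendor/*` excludes everything below `vendor/`; a real server might prefer stricter glob semantics.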
@kaloyan-raev I'm not so sure if the server should really try to impose one model here that will fit all language servers. As long as the protocol provides all the tools, I prefer to leave the server the freedom to handle this. For example, a server may not even need to index the whole project at startup. It could just do it lazily (that's what I wanted to do with the PHP language server). When a file is opened, it is parsed. When a provideCompletion is triggered at a specific cursor position, it would resolve the type before the cursor, which might be in another file. It would resolve the file and parse it as well, adding it to what you refer to as the "index", or simply an AST cache. On a didChange notification, the AST would also get re-parsed. There would be no need to exclude files because it would only ever parse files which are actually interesting. Maybe I'm missing something, but is it really needed to "index" the whole project at startup? |
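The lazy approach described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the php-language-server implementation: `LazyAstCache`, the `parse` callback (source text to AST) and the `read` callback (URI to source text) are all hypothetical names.

```python
from typing import Callable, Dict, List


class LazyAstCache:
    """Caches one parsed tree per document URI, filled only on demand."""

    def __init__(self, parse: Callable[[str], object], read: Callable[[str], str]):
        self._parse = parse   # hypothetical parser: source text -> AST
        self._read = read     # hypothetical loader: URI -> source text
        self._trees: Dict[str, object] = {}

    def did_open(self, uri: str, text: str) -> None:
        # Parse a file as soon as the editor opens it.
        self._trees[uri] = self._parse(text)

    def did_change(self, uri: str, text: str) -> None:
        # Re-parse on every change so the cached tree does not go stale.
        self._trees[uri] = self._parse(text)

    def resolve(self, uri: str) -> object:
        # Used when completion needs a type defined in another file:
        # parse that file on first use, then serve it from the cache.
        if uri not in self._trees:
            self._trees[uri] = self._parse(self._read(uri))
        return self._trees[uri]

    def cached_uris(self) -> List[str]:
        return sorted(self._trees)
```

Only files that were opened or reached through type resolution ever enter the cache, which is why no exclude list is needed in this model.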
@felixfbecker what you describe may work in a local context, e.g. when you have an object already typed in and the server has to return a proposal list with the object's properties and methods. But what about the global context? Imagine you start typing a function name (or class/interface/trait/namespace name). After entering the first 1-2-3 characters you invoke the code completion. In this case the server has to return a proposal list with all global functions, variables, constants, classes, interfaces, namespaces, etc. Probably, filtered with what you have already typed in the editor. How do you build this list without parsing all the PHP files in the project, including the composer dependencies? |
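The global-completion case raised here can be illustrated with a small sketch: the candidate list for a short prefix has to be drawn from every file's declarations, so all of them must have been collected beforehand. The function names and the input shape (a mapping from file path to declared symbol names) are assumptions for illustration.

```python
import bisect
from typing import Dict, Iterable, List, Tuple

IndexEntry = Tuple[str, str, str]  # (lowercased name, original name, file)


def build_global_index(symbols_by_file: Dict[str, Iterable[str]]) -> List[IndexEntry]:
    """Flatten per-file symbol lists into one sorted list for prefix lookup."""
    entries = []
    for path, names in symbols_by_file.items():
        for name in names:
            entries.append((name.lower(), name, path))
    entries.sort()
    return entries


def complete(index: List[IndexEntry], prefix: str) -> List[Tuple[str, str]]:
    """Return (name, file) pairs whose name starts with prefix, ignoring case."""
    key = prefix.lower()
    # A 1-tuple sorts before any 3-tuple with the same first element,
    # so this finds the first entry whose lowercased name is >= key.
    start = bisect.bisect_left(index, (key,))
    results = []
    for lowered, name, path in index[start:]:
        if not lowered.startswith(key):
            break
        results.append((name, path))
    return results
```

A symbol declared in a file that was never parsed (for example a Composer dependency) simply cannot appear in the result, which is the gap a purely lazy cache leaves.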
@kaloyan-raev Couldn't you do full indexing on But the documentation could indeed be better in regard to what is expected after sending one request or another. BTW I think you can always add a custom UI action via your language's VS Code extension to send a full build request. |
@akosyakov Yes, this would be possible with the limitation of issue #55 I mentioned above. The point of opening this issue was to clarify the indexing workflow. If everything can be achieved with the existing protocol capabilities then this is perfect. But it still needs to be documented, because I already see that the few language servers available take completely different approaches. See my initial comment for details. If many languages require indexing to provide good code completion, then we should have the indexing described as part of the protocol to avoid unnecessarily different custom implementations in language servers. This would make it easier to adopt language servers in different IDEs, which is the whole point of having this protocol. |
@kaloyan-raev What's so bad about different servers taking different implementation approaches? If in the end they conform to the protocol, it is OK. It's even good to have such flexibility on the server side and keep the IDE (tool) side simple. I agree that documentation should be improved, but not in terms of implementation details, e.g. a server should index on a change file notification, but in terms of the client's assumptions, e.g. if a client sends a file change notification then the client expects to get diagnostics notifications for files affected by this change, basically in terms of the protocol itself. How the server computes affected files and validates them does not really matter from the client side. It can start indexing during initialization, do it on demand the first time, or maybe it does not even need an index at all. |
@akosyakov It's bad. To the point that it makes the idea of language server protocol useless. Let me give an example. We have two different language server implementations (
Now, let's have a tool (e.g. VS Code) that uses implementation A. If the tool switches to implementation B then indexing will stop working, because the client won't send the custom If both implementations used a common way to implement the client/server communication regarding indexing, then no change on the client would be required. Which is the great flexibility that the language server protocol gives. Any custom message on top of the protocol introduces tight coupling between the client and the server. Which makes it difficult to reuse language servers across IDEs. Let's look at the "Indexing" topic as a use case of the protocol. We should have a clear definition how this use case should utilize the protocol and identify any gaps (like #55). Then the more language servers (which implement indexing) and IDE tools follow it, the more reusable they will be. |
@kaloyan-raev you are assuming that custom messages are used, I assumed the opposite. No language server I know takes such a long time to index that a progress report would be justified, and no IDE implementation I know displays some kind of progress that comes from a custom message. So what you are asking for is actually a way for a server to show progress on indexing, which is a feature request for an extension of the protocol. As long as the protocol does not define any message, I would just let the indexing happen in the background without any progress reports. Do you have any data that backs the need for a progress event? |
Take a look again at the HvyIndustries/crane language server. I tried to refactor (PR here) the client/server communication as much as possible, but still needed to use a handful of custom messages not defined by the protocol.
Eclipse PDT, Zend Studio, PhpStorm and VSCode Crane Extension clearly display the progress of the indexing to the user. I haven't checked other IDEs. Having an index and displaying the progress of building it comes from long years of experience working with projects of average and big sizes. The index provides faster IntelliSense. However, building the index is time consuming (may take several minutes on big projects). Thus users are confused if IntelliSense does not work as expected shortly after opening the project. Displaying the indexing progress resolves this usability issue. |
@kaloyan-raev You seem to have more experience with this than me. I only know that the TypeScript language server for example takes a few seconds until ready. When you hover over a symbol in the meantime, it will display Loading... in the hover for a few seconds, then replace it with the actual result. I never found that experience confusing. I would welcome any contribution from you to php-language-server on how you would implement indexing. |
@kaloyan-raev it's bad when one customizes the protocol and assumes that everybody should conform to his customization. It makes clients and implementations of such a customized protocol useless. I am not opposed to the idea of extending the protocol; some languages and front-ends can provide more capabilities, e.g. you are right that some IDEs, like IntelliJ and Eclipse, can report progress. But as a server implementor, if I make a minimal usable version for the protocol as it is, I will cover more clients; after that I can introduce customizations for clients which can handle them. The same is true for a tool implementor. I do think both I have 2 questions on my mind regarding extensibility:
|
Long and great discussion! Hard to answer all aspects of the discussion so I will focus on the index case. My thinking about indexing is as follows:
|
Opened #70 |
hope i'm not too late to this discussion :) Merkle trees are also used in the BitTorrent protocol. addressing the original questions: "How the client request the server to (re-)build the index."
refetch when the top-hash changes, or do as detailed in the algorithm for a more efficient refetch of the tree. "How the client tells the server which files are included and excluded for indexing." not sure i fully understand this, but i think using a hash that is derived from the contents of the file, say, will be an indicator of the files open. not sure if i missed the discussion about this but has there been any mention a different way of doing a not sure how different files that have the same contents, i.e., hash to the same value, should be dealt with. e.g., a duplicate file somewhere else in the tree. how crazy does this all sound to everyone? |
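The top-hash idea in this comment can be sketched with a plain binary Merkle tree over the project files. This is only an illustration: the function name is hypothetical, SHA-256 is an arbitrary choice, and hashing the path together with the content is just one possible answer to the duplicate-file question (identical files at different paths then get different leaves).

```python
import hashlib
from typing import Dict


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(files: Dict[str, bytes]) -> str:
    """Return a hex top-hash over {path: content}; it changes if any file changes."""
    # Leaves are ordered by path so the result does not depend on dict order.
    level = [
        _h(path.encode("utf-8") + b"\0" + content)
        for path, content in sorted(files.items())
    ]
    if not level:
        return _h(b"").hex()
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


before = merkle_root({"a.php": b"<?php", "b.php": b"<?php echo 1;"})
after = merkle_root({"a.php": b"<?php", "b.php": b"<?php echo 2;"})
print(before != after)  # a changed top-hash signals that a re-index is needed
```

Comparing subtree hashes level by level, rather than only the root, is what would allow the more efficient partial refetch mentioned in the comment; that part is not shown here.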
Just to add another datapoint - for our languages Flow and Hack, and our workloads of many thousands of files in a project, we can't afford to do indexing on demand. Instead we spin up a singleton "language service" on the machine that keeps an index always up to date. Our LSP server merely connects to the existing language service for a given project. (If the LSP server is launched but there doesn't yet exist a currently-running machine-wide language service for that project, then the LSP server will cause one to spin up. That takes 1-20 minutes depending on project size. We certainly do need some kind of progress report in the protocol!) |
Version 3.15 will have support for progress reporting. I will close the issue as out of scope. |
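For reference, a rough sketch of the message sequence a server could emit to report indexing progress with the 3.15 work-done progress support: a `window/workDoneProgress/create` request followed by `$/progress` notifications whose value has kind `begin`, `report` or `end`. The helper names, the token value, the titles and the batching are assumptions for illustration; a real server should also wait for the client's response to the create request before sending progress.

```python
from typing import Dict, Iterator


def _progress(token: str, value: Dict) -> Dict:
    # "$/progress" is a notification, so it carries no "id".
    return {"jsonrpc": "2.0", "method": "$/progress",
            "params": {"token": token, "value": value}}


def indexing_progress(token: str, total_files: int, batch: int = 1,
                      request_id: int = 1) -> Iterator[Dict]:
    """Yield the JSON-RPC messages for one indexing run (illustrative only)."""
    # Ask the client to create a progress indicator for this token.
    yield {"jsonrpc": "2.0", "id": request_id,
           "method": "window/workDoneProgress/create",
           "params": {"token": token}}
    yield _progress(token, {"kind": "begin", "title": "Indexing", "percentage": 0})
    done = 0
    while done < total_files:
        done = min(done + batch, total_files)  # pretend one batch was indexed
        yield _progress(token, {
            "kind": "report",
            "message": f"{done}/{total_files} files",
            "percentage": done * 100 // total_files,
        })
    yield _progress(token, {"kind": "end", "message": "Indexing complete"})


for message in indexing_progress("indexing-1", total_files=4, batch=2):
    print(message["method"], message["params"].get("value", {}).get("kind"))
```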
Many languages require building an index database for an efficient calculation of code completion, validation, highlighting, etc.
The language server protocol currently does not describe how the communication between the client and the server should happen with regard to indexing. Thus, every server-client pair takes its own approach. For example, the Crane PHP language server uses custom message requests and notifications, while the Eclipse Java language server seems to trigger the indexing as part of the initialize request.
Please, consider specifying: