
Context for Archive File Formats #335

Open
CodingKoalaGeneral opened this issue Oct 1, 2024 · 3 comments
Labels
enhancement New feature or request working on Featurers that are actively being worked on

Comments


CodingKoalaGeneral commented Oct 1, 2024

Describe the solution you'd like
Attachment feature (including drag-and-drop context) for archive file formats: decompress them and add each code file or document inside as conversation context.

Additional context

Extend the drag-and-drop file attachment feature in the Alpaca project to support archive file formats (such as .zip, .tar, and .gz).

1. Existing File Handling

  • Identify where files are currently processed in the code. This is likely managed in the "src" directory, specifically in the functions that handle drag-and-drop or file selection features.
  • Ensure that the drag-and-drop event listeners can recognize and differentiate file types. Archive formats like .zip, .tar, etc., will require additional processing to extract and handle their contents.
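The file-type check described above could be factored into a small helper. This is a sketch, not Alpaca's actual code; the function name and the list of suffixes are assumptions chosen for illustration:

```python
# Hypothetical helper: classify a dropped file by extension so archive
# formats can be routed to an extraction step instead of the normal path.
ARCHIVE_SUFFIXES = ('.zip', '.tar', '.tar.gz', '.tgz', '.tar.bz2', '.gz')

def is_archive(file_path: str) -> bool:
    """Return True if the path looks like a supported archive format."""
    return file_path.lower().endswith(ARCHIVE_SUFFIXES)
```

Matching on the lowercased path keeps the check case-insensitive, so `Project.ZIP` is recognized too.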

2. Modifying Drag-and-Drop Handling

Drag-and-drop functionality is typically implemented using event listeners that handle the dropped file. You'll want to modify or extend these listeners to detect archive file types.

For example:

def handle_file_drop(file_path):
    # Route archives to an extraction step; other files keep the existing path.
    if file_path.endswith(('.zip', '.tar.gz')):
        extract_archive(file_path)
    else:
        # Handle other file types (like plain text or images)
        pass

3. Adding Support for Archive Extraction

To handle archive files, you'll need to add functionality that extracts them upon being dropped. Python has several libraries like zipfile, tarfile, and shutil that can manage this.

Here’s a basic implementation for handling .zip files:

import zipfile
import tarfile

def extract_archive(file_path):
    # Extract into a destination chosen by the app (placeholder path here).
    if file_path.endswith('.zip'):
        with zipfile.ZipFile(file_path, 'r') as zip_ref:
            zip_ref.extractall('/desired/extraction/path')
    elif file_path.endswith('.tar.gz') or file_path.endswith('.tgz'):
        with tarfile.open(file_path, 'r:gz') as tar_ref:
            # On Python 3.12+, pass filter='data' to guard against
            # path-traversal entries in untrusted tarballs.
            tar_ref.extractall('/desired/extraction/path')

Once extracted, you can further process the contents, for example, displaying them in the UI or loading them for other purposes.

4. Integrating Archive Support into the UI

After extracting the files, you’ll want to update the UI to reflect the newly available files. If Alpaca uses GTK4 (as it seems), you can update the file list or content viewer by refreshing the relevant widgets with the extracted files.
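Since the exact GTK4 widgets Alpaca uses are not shown here, a toolkit-agnostic helper can at least prepare the data the UI needs. This is a hypothetical sketch: it turns extracted file paths into (label, path) pairs that a file-list widget such as a Gtk.ListBox could be populated from:

```python
import os

def build_attachment_rows(extracted_paths, base_dir):
    """Produce sorted (display_label, full_path) pairs for a file-list
    widget, labeling each row by its path relative to the extraction dir."""
    return [
        (os.path.relpath(path, base_dir), path)
        for path in sorted(extracted_paths)
    ]
```

Keeping this logic out of the widget code makes it easy to unit-test the refresh behavior without a running GTK application.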

5. Testing and Error Handling

Include error handling for invalid archives or failed extractions. This can mean wrapping the extraction process in try-except blocks to catch exceptions and notify the user when something goes wrong.

6. Documentation

Update any relevant documentation or help text within the Alpaca project to reflect the new functionality for supporting archive files.

@CodingKoalaGeneral CodingKoalaGeneral added the enhancement New feature or request label Oct 1, 2024
Jeffser (Owner) commented Oct 8, 2024

Hi, thanks for the suggestion. It should be possible; I'll look into it.

@Jeffser Jeffser added the working on Featurers that are actively being worked on label Oct 8, 2024
CodingKoalaGeneral (Author) commented

Please check out

https://github.com/nomic-ai/gpt4all

They have a feature called 'LocalDocs', which allows context file archives to be enabled easily when needed. It optimizes use cases involving specific context such as source code, PDF documents, and more.

CodingKoalaGeneral (Author) commented

@Jeffser , thanks for the updates so far. In the meantime, I encountered some issues with Ollama not utilizing the GPU, though Alpaca continued to work fine. :)

When I have more spare time and my setup is solid, I plan to create a branch for the feedback loop feature, which seems promising for code execution (automated bug fixing) with the challenge of introducing compiler dependencies. Additionally, I aim to support the context archive feature.

I recently upgraded my laptop’s RAM to 96GB to test 70b/72b models effectively. However, I’ll need a new motherboard for my tower to handle the 405b models.

My tests with ollama, using a Python script to call the API and implement a basic feedback loop, have shown promising results in code correction and bug fixing. However, models below 70b struggle to solve complex, text-based human and technical problems without guidance from internet sources or user intervention. By comparison, more advanced commercial AI models already seem to incorporate such features.
