Introduction
Today, we are excited to announce the release of Qwen3, the latest addition to the Qwen family of large language models. Our flagship model, Qwen3-235B-A22B, achieves competitive results in benchmark evaluations of coding, math, general capabilities, etc., when compared to other top-tier models such as DeepSeek-R1, o1, o3-mini, Grok-3, and Gemini-2.5-Pro. Additionally, the small MoE model, Qwen3-30B-A3B, outcompetes QwQ-32B, which uses 10 times as many activated parameters, and even a tiny model like Qwen3-4B can rival the performance of Qwen2.5-72B-Instruct.
Qwen3-30B-A3B-Q6_K - Thoughts (29081 characters)
<think>
Okay, I need to write a Python script that sorts all .gguf files in the current directory into subdirectories based on their publisher. The publishers are determined from URLs stored in files named dl{YYYYMMDD}.txt. Each line in these files has a URL, and some might be commented out with a #. The publisher is the first part after the domain in the URL. Let me think through how to approach this.
First, I need to find all the .gguf files in the current directory. Then, for each of them, I need to determine their publisher. But how do I get the URLs associated with each .gguf file? Wait, the problem says that the URLs are in the dl{YYYYMMDD}.txt files. But how do I know which URL corresponds to which .gguf file? Hmm, maybe the URLs in the txt files are the ones that were used to download the .gguf files. So each .gguf file was downloaded via a URL listed in one of these txt files. But how do I associate each .gguf file with its URL?
Wait, the problem statement says that the URLs are in the files named dl{YYYYMMDD}.txt. But the .gguf files themselves don’t have any metadata linking them to the URLs. So maybe the assumption is that each .gguf file was downloaded using one of the URLs in these txt files. But how to determine which URL corresponds to which .gguf file? That’s a problem. Unless there’s some naming convention I’m missing.
Wait, maybe the URLs in the txt files are the ones that were used to download the .gguf files. For example, if a URL is “https://huggingface.co/unsloth/.../file.gguf”, then the corresponding .gguf file would be named something like “file.gguf”. But the problem says that the .gguf files are in the current directory. So perhaps the script needs to look through all the URLs in the txt files, extract the publisher, and then for each .gguf file, find which URL it came from. But without any direct link, how?
Wait, maybe the .gguf files are named after the file part of the URL. For example, the URL ends with “Qwen3-30B-A3B-Q6_K.gguf?download=true”, so the .gguf file would be named “Qwen3-30B-A3B-Q6_K.gguf”. If that’s the case, then for each .gguf file, we can check if there’s a URL in the txt files that has the same filename. But that might not be reliable. Alternatively, maybe the script is supposed to process all the URLs in the txt files, extract their publishers, and then for each .gguf file, determine which publisher it belongs to based on the URL that was used to download it. However, without knowing which URL corresponds to which .gguf file, this seems impossible.
Wait, maybe the problem is that the user expects that each .gguf file was downloaded via one of the URLs in the txt files. So the script should process all the URLs in the txt files, extract the publisher, and then for each .gguf file, find which URL (if any) it corresponds to. But how?
Alternatively, perhaps the script is supposed to take each .gguf file, and for each of them, look through all the URLs in the txt files to find which one matches. But how? For example, if the .gguf file is named “Qwen3-30B-A3B-Q6_K.gguf”, then the URL would have that as the filename. So perhaps the script can check for each .gguf file, whether there’s a URL in the txt files that has that filename. But that’s a bit of a stretch. However, given the problem statement, perhaps that’s the intended approach.
Alternatively, maybe the problem is that the URLs in the txt files are the ones that were used to download the .gguf files, and each .gguf file’s name is derived from the URL. So the script can process all the URLs, extract the publisher, and then for each .gguf file, determine which URL it came from by matching the filename. But this requires that the .gguf files are named exactly as the file part of the URL.
Assuming that’s the case, here’s the plan:
1. Find all .gguf files in the current directory.
2. For each .gguf file, extract the filename (without the path).
3. Search through all the URLs in the dl{YYYYMMDD}.txt files to find which URL has a filename that matches the .gguf file’s name.
4. Once found, extract the publisher from that URL.
5. Create a subdirectory under the publisher’s name (if it doesn’t exist) and move the .gguf file there.
But how to handle cases where multiple URLs have the same filename? Or if no URL matches? The problem statement says that the URLs are in the txt files, but maybe not all .gguf files are accounted for. But the script should process as many as possible.
Alternatively, perhaps the script is supposed to process all URLs in the txt files, extract the publisher, and then create directories for each publisher, but how does that relate to the .gguf files? Maybe the .gguf files are already in the directory, and the script is supposed to move them into the publisher’s directory based on the URLs in the txt files. But without knowing which .gguf file corresponds to which URL, this seems impossible.
Wait, maybe the problem is that the .gguf files were downloaded using the URLs in the txt files, and the script needs to associate each .gguf file with the URL that was used to download it. But how? Unless there’s some other metadata. But the problem statement doesn’t mention that.
This is a bit confusing. Let me re-read the problem statement.
The user says: “The URLs are in files in the current directory named in the format dl{YYYYMMDD}.txt, with one url per line. Some url lines may be commented out with a single # character, but they are still a valid source of info. An example url is … and the publisher is always the first node after the domain, in this case unsloth.”
So the URLs are stored in th
[…trimmed because lemmy won’t handle >10,000 chars]
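In short, the thoughts converge on mapping the file part of each URL in the dl{YYYYMMDD}.txt files to its publisher and then matching local .gguf filenames against that map. A minimal sketch of that idea (the helper name build_publisher_map is made up here, and it assumes local files keep the exact filename from the URL path):

    import glob
    import os
    from urllib.parse import urlparse

    def build_publisher_map():
        """Map each .gguf filename found in dl*.txt URLs to its publisher."""
        mapping = {}
        for txt_file in glob.glob("dl*.txt"):
            with open(txt_file, encoding="utf-8") as f:
                for line in f:
                    url = line.lstrip("#").strip()  # commented-out lines still count
                    if not url:
                        continue
                    path = urlparse(url).path       # query string (?download=true) is dropped
                    parts = path.split("/")
                    if path.endswith(".gguf") and len(parts) > 2:
                        mapping[os.path.basename(path)] = parts[1]  # first node after the domain
        return mapping

    # Each local .gguf file can then be looked up in the map:
    # publishers = build_publisher_map()
    # for gguf in glob.glob("*.gguf"):
    #     print(gguf, "->", publishers.get(gguf, "unknown"))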
Qwen3-30B-A3B-Q6_K - Answer
Here’s a well-documented Python script that fulfills the requirements of sorting .gguf files into directories based on their source publishers, as extracted from URLs stored in dl{YYYYMMDD}.txt files.
✅ What This Script Does
Finds all .gguf files in the current working directory.
Reads all dl{YYYYMMDD}.txt files in the current directory.
Parses each URL in those files, including lines that are commented out (i.e., starting with #) but still considered valid.
Extracts the file name from the URL path.
Checks if the file exists in the current directory.
Determines the publisher by parsing the URL path.
Creates a directory for the publisher if it doesn’t exist.
Moves the .gguf file into the corresponding publisher directory.
📁 Sample Input (from dl20231010.txt)
https://huggingface.co/unsloth/Qwen3-30B-A3B-GGUF/resolve/main/Qwen3-30B-A3B-Q6_K.gguf?download=true
# https://huggingface.co/another_publisher/model.gguf
📦 Python Script
import os
import shutil
from urllib.parse import urlparse

def main():
    # Step 1: Find all dl{YYYYMMDD}.txt files in the current directory
    txt_files = [f for f in os.listdir() if f.startswith('dl') and f.endswith('.txt')]

    for txt_file in txt_files:
        print(f"Processing {txt_file}...")
        try:
            with open(txt_file, 'r', encoding='utf-8') as f:
                lines = f.readlines()
        except Exception as e:
            print(f"Error reading {txt_file}: {e}")
            continue

        for line in lines:
            line = line.strip()
            # Step 2: Remove any leading '#' and whitespace
            if line.startswith('#'):
                line = line[1:].strip()
            if not line:
                continue
            try:
                # Step 3: Parse the URL and extract the filename
                parsed = urlparse(line)
                path = parsed.path
                filename = os.path.basename(path)

                # Skip non-.gguf files
                if not filename.endswith('.gguf'):
                    continue

                # Step 4: Check if the file exists in the current directory
                if not os.path.exists(filename):
                    print(f"File '{filename}' not found for URL '{line}'")
                    continue

                # Step 5: Extract publisher from the URL path
                path_parts = parsed.path.split('/')
                if len(path_parts) < 3:
                    print(f"Invalid URL format: '{line}'")
                    continue
                publisher = path_parts[1]

                # Step 6: Create the publisher directory if it doesn't exist
                publisher_dir = os.path.join(os.getcwd(), publisher)
                os.makedirs(publisher_dir, exist_ok=True)

                # Step 7: Move the file
                src_path = os.path.join(os.getcwd(), filename)
                dst_path = os.path.join(publisher_dir, filename)
                shutil.move(src_path, dst_path)
                print(f"Moved '{filename}' to '{publisher_dir}'")
            except Exception as e:
                print(f"Error processing URL '{line}': {e}")

if __name__ == "__main__":
    main()
📝 Notes
The script handles both commented and un-commented lines in dl{YYYYMMDD}.txt.
It assumes that .gguf files are named exactly as per the URL’s file part (e.g., Qwen3-30B-A3B-Q6_K.gguf).
If a .gguf file is referenced by multiple URLs, it will be moved only once (after that, it will no longer exist in the current directory).
The script uses shutil.move, which is safe and works across file systems.
🧪 Example Output
Processing dl20231010.txt...
Moved 'Qwen3-30B-A3B-Q6_K.gguf' to 'unsloth'
File 'model.gguf' not found for URL 'https://huggingface.co/another_publisher/model.gguf'
This script is designed to be robust and user-friendly, ensuring all .gguf files are properly organized based on their source publishers.
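As a quick sanity check of the parsing step (using the same calls the script relies on), the example URL from the prompt splits as expected:

    import os
    from urllib.parse import urlparse

    url = "https://huggingface.co/unsloth/Qwen3-30B-A3B-GGUF/resolve/main/Qwen3-30B-A3B-Q6_K.gguf?download=true"
    parsed = urlparse(url)
    print(parsed.path.split('/')[1])      # 'unsloth' (the publisher)
    print(os.path.basename(parsed.path))  # 'Qwen3-30B-A3B-Q6_K.gguf' (query string is excluded)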
Qwen3-30B-A3B-Q6_K:
Amt: 8003, Process: 10.67s (17.52 T/s), Generate: 1408.00s (5.68 T/s), Total: 1418.67s