Vicuna-13B is an open-source chatbot trained by fine-tuning the LLaMA model on user-shared conversations collected from ShareGPT. The authors claim that a preliminary evaluation using GPT-4 as a judge shows Vicuna-13B achieving more than 90% of the quality of OpenAI ChatGPT and Google Bard, while outperforming other models such as LLaMA and Stanford Alpaca in more than 90% of cases. Training Vicuna-13B costs around $300, and the training and serving code, along with an online demo, are publicly available for non-commercial use.

The website presents examples of Vicuna-13B’s responses to benchmark questions and compares them with the responses generated by Alpaca. The authors claim that after fine-tuning on 70K user-shared ChatGPT conversations, Vicuna generates more detailed and better-structured answers than Alpaca, with quality on par with ChatGPT.

However, evaluating chatbots is never a simple task. Given recent advances in GPT-4, the authors suggest that its capabilities may have reached a human-like level that could enable an automated evaluation framework for benchmark generation and performance assessment. Preliminary evaluations based on GPT-4 show that Vicuna achieves 90% of the capability of Bard/ChatGPT. While this proposed framework shows potential for automating chatbot assessment, it is not yet a rigorous approach. Building an evaluation system for chatbots remains an open question requiring further research.
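To make the two kinds of percentage quoted above concrete, here is a minimal sketch of how per-question judge scores could be aggregated. It is an assumption about the arithmetic, not the authors' published method: the function names `relative_quality` and `win_rate` are hypothetical, the scores are made-up illustrative numbers rather than real evaluation data, and no call to GPT-4 is made.

```python
from typing import Sequence


def relative_quality(candidate: Sequence[float], reference: Sequence[float]) -> float:
    """Candidate's total judge score as a percentage of the reference's total."""
    if len(candidate) != len(reference) or not candidate:
        raise ValueError("score lists must be non-empty and of equal length")
    reference_total = sum(reference)
    if reference_total == 0:
        raise ValueError("reference total score must be non-zero")
    return 100.0 * sum(candidate) / reference_total


def win_rate(candidate: Sequence[float], baseline: Sequence[float]) -> float:
    """Percentage of questions where the candidate scores strictly higher."""
    if len(candidate) != len(baseline) or not candidate:
        raise ValueError("score lists must be non-empty and of equal length")
    wins = sum(c > b for c, b in zip(candidate, baseline))
    return 100.0 * wins / len(candidate)


# Hypothetical judge scores (scale of 1 to 10) for four questions.
# These numbers are invented for illustration only.
candidate_scores = [8.0, 9.0, 7.0, 9.0]
reference_scores = [9.0, 9.0, 8.0, 10.0]
baseline_scores = [6.0, 9.5, 5.0, 7.0]

if __name__ == "__main__":
    print(f"relative quality: {relative_quality(candidate_scores, reference_scores):.1f}%")
    print(f"win rate: {win_rate(candidate_scores, baseline_scores):.1f}%")
```

The two metrics answer different questions: the first compares total scores against a stronger reference model, while the second counts how often the candidate beats a weaker baseline. Both inherit any bias in the judge, which is one reason the framework is described as not yet rigorous.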

Overall, Vicuna-13B is a promising open-source chatbot that, according to this preliminary evaluation, produces responses comparable in quality to ChatGPT and Google Bard and outperforms other models in most cases.

You can try the Vicuna-13B demo online.

