An evaluation benchmark for generative AI in security domain

2024
Download Copy BibTeX
Copy BibTeX
As computing environments become increasingly complex and distributed, the volume and complexity of security data generated across various systems have grown exponentially. Extracting useful insights from this security data is crucial for effective security analytics, anomaly detection, and threat identification. However, there is a lack of comprehensive evaluation benchmarks for assessing the performance of large language models trained on any security log dataset, hindering progress in this domain. This paper proposes a comprehensive evaluation benchmark for security data, addressing this critical gap. The benchmark is easily adoptable to any security log dataset and comprises four diverse categories of tasks: supervised evaluations, unsupervised evaluations, anomaly detection, and semantic similarity evaluations. By providing a standardized framework for evaluation, the benchmark enables objective comparison and reproducible assessment of state-of-the-art embedding models across various computing environments and security log sources.

Latest news

US, WA, Seattle
Are you seeking an environment where you can drive innovation? WW Amazon Stores Finance Science (ASFS) works to leverage science and economics to drive improved financial results, foster data backed decisions, and embed science within Finance. ASFS is focused on developing products that empower controllership, improve financial planning by understanding financial drivers, and innovate science capabilities for efficiency and scale. Our team owns sophisticated science capabilities for forecasting the WW Amazon Stores P&L, focusing on costs and the bottomline (profitability). We are looking for an outstanding Senior economist to lead new high visibility initiatives for forecasting the WW Amazon Stores P&L (focusing on costs and the bottomline). The forecasting models will be used to enable better financial planning and decision making for senior leadership up to VP level. You will build new econometric models from the ground up. The role will develop new driver based forecasting models for Retail related P&L lines that incorporate business drivers. The Sr Economist will also help generate new insights on how macroeconomic factors impact the P&L. This role will have very high visibility with senior leadership up to VP level. We prize creative problem solvers with the ability to draw on an expansive methodological toolkit to transform financial planning and decision-making through economics. The ideal candidate combines econometric acumen with strong business judgment. You have versatile modeling skills and are comfortable owning and extracting insights from data. You are excited to learn from and alongside seasoned scientists, engineers, economists, and business leaders. You are an excellent communicator and effectively translate technical findings into business action.
US, CA, East Palo Alto
The Customer Engagement Technology team leads AI/LLM-driven customer experience transformation using task-oriented dialogue systems. We develop multi-modal, multi-turn, goal-oriented dialog systems that can handle customer issues at Amazon scale across multiple languages. These systems are designed to adapt to changing company policies and invoke correct APIs to automate solutions to customer problems. Additionally, we enhance associate productivity through response/action recommendation, summarization to capture conversation context succinctly, retrieving precise information from documents to provide useful information to the agent, and machine translation to facilitate smoother conversations when the customer and agent speak different languages. Key focus areas include: 1. Task-Oriented Dialog Systems: Building reliable, scalable, and adaptive LLM-based agents for understanding intents, determining eligibilities, making API calls, confirming outcomes, and exploring alternatives across hundreds of customer service intents, while adapting to changing policies. 2. Lifelong Learning: Researching continuous learning approaches for injecting new domain knowledge while retaining the model's foundational abilities and prevent catastrophic forgetting. 3. Agentic Systems: Developing a modular agentic framework to handle multi domain conversations through appropriate system abstractions. 4. Complex Multi-turn Instruction Following: Identifying approaches to guarantee compliance with instructions that specify standard operating procedures for handling multi-turn complex scenarios. 5. Inference-Time Adaptability: Researching inference-time scaling methods and improving in-context learning abilities of custom models to enable real-time adaptability to new features, actions, or bug fixes without solely relying on retraining. 6. Context Adherence: Exploring methods to ground responses in specific customer attributes, account information, and behavioral data to prevent hallucinations and ensure high-fidelity responses. 7. Policy Grounding: Investigating techniques to align bot behavior with evolving company policies by grounding on complex, unstructured policy documents, ensuring consistent and compliant actions. 1. End to End Dialog Policy Optimization: Researching alignment approaches to optimize successful dialog completions. 2. Scalable Evaluations: Developing automated approaches to evaluate quality of experience, and correctness of agentic resolutions Key job responsibilities 1. Research and development of LLM-based chatbots and conversational AI systems for customer service applications. 2. Design and implement state-of-the-art NLP and ML models for tasks such as language understanding, dialogue management, and response generation. 3. Collaborate with cross-functional teams, including data scientists, software engineers, and product managers, to integrate LLM-based solutions into Amazon's customer service platforms. 4. Develop and implement strategies for data collection, annotation, and model training to ensure high-quality and robust performance of the chatbots. 5. Conduct experiments and evaluations to measure the performance of the developed models and systems, and identify areas for improvement. 6. Stay up-to-date with the latest advancements in NLP, LLMs, and conversational AI, and explore opportunities to incorporate new techniques and technologies into Amazon's customer service solutions. 7. Collaborate with internal and external research communities, participate in conferences and publications, and contribute to the advancement of the field.
IN, TN, Chennai
DESCRIPTION The Digital Acceleration (DA) team in India is seeking a talented, self-driven Applied Scientist to work on prototyping, optimizing, and deploying ML algorithms for solving Digital businesses problems. Key job responsibilities - Research, experiment and build Proof Of Concepts advancing the state of the art in AI & ML. - Collaborate with cross-functional teams to architect and execute technically rigorous AI projects. - Thrive in dynamic environments, adapting quickly to evolving technical requirements and deadlines. - Engage in effective technical communication (written & spoken) with coordination across teams. - Conduct thorough documentation of algorithms, methodologies, and findings for transparency and reproducibility. - Publish research papers in internal and external venues of repute - Support on-call activities for critical issues BASIC QUALIFICATIONS - Experience building machine learning models or developing algorithms for business application - PhD, or a Master's degree and experience in CS, CE, ML or related field - Knowledge of programming languages such as C/C++, Python, Java or Perl - Experience in any of the following areas: algorithms and data structures, parsing, numerical optimization, data mining, parallel and distributed computing, high-performance computing - Proficiency in coding and software development, with a strong focus on machine learning frameworks. - Understanding of relevant statistical measures such as confidence intervals, significance of error measurements, development and evaluation data sets, etc. - Excellent communication skills (written & spoken) and ability to collaborate effectively in a distributed, cross-functional team setting. PREFERRED QUALIFICATIONS - 5+ years of building machine learning models or developing algorithms for business application experience - Have publications at top-tier peer-reviewed conferences or journals - Track record of diving into data to discover hidden patterns and conducting error/deviation analysis - Ability to develop experimental and analytic plans for data modeling processes, use of strong baselines, ability to accurately determine cause and effect relations - Exceptional level of organization and strong attention to detail - Comfortable working in a fast paced, highly collaborative, dynamic work environment
CN, 11, Beijing
职位:Applied scientist 应用科学家实习生 毕业时间:2025年10月 - 2026年9月之间毕业的应届毕业生 · 入职日期:2025年6月及之前 · 实习时间:保证一周实习4-5天全职实习,至少持续5个月 · 工作地点:北京朝阳区酒仙桥路恒通商务园区 · 校招信息请参考校园招聘申请手册: https://amazonexteu.qualtrics.com/CP/File.php?F=F_55YI0e7rNdeoB6e 投递须知: 1 填写简历申请时,请把必填和非必填项都填写完整。提交简历之后就无法修改了哦! 2 学校的英文全称请准确填写。 如果您正在攻读计算机视觉、生成式AI或多模态领域的博士或硕士研究生,而且对应用科学家的实习工作感兴趣。 如果您也喜爱深入研究棘手的技术问题并提出解决方案,用成功的产品显著地改善人们的生活。 那么,我们诚挚邀请您加入亚马逊的International Technology自动化营销团队改善亚马逊节假日促销的用户体验。我们的目标是帮助亚马逊的客户找到他们所需的产品,并发现他们感兴趣的新产品。 这会是一份收获满满的工作。您每天的工作都与全球数百万亚马逊客户的体验紧密相关。您将提出和探索LLM和CV领域的创新,例如如何精准控制最前沿的基座大语言模型和图像生成模型以满足自动化的需求。您将集成这些模型到工具链中生成个性化的促销广告图,通过标注数据、建模和客户反馈来完成闭环。您对模型的选择需要能够平衡业务指标和响应时间的需求。
CN, 11, Beijing
职位:Applied scientist 应用科学家实习生 毕业时间:2025年10月 - 2026年9月之间毕业的应届毕业生 · 入职日期:2025年6月及之前 · 实习时间:保证一周实习4-5天全职实习,至少持续3个月 · 工作地点:北京朝阳区酒仙桥路恒通商务园区 · 校招信息请参考校园招聘申请手册: https://amazonexteu.qualtrics.com/CP/File.php?F=F_55YI0e7rNdeoB6e 投递须知: 1 填写简历申请时,请把必填和非必填项都填写完整。提交简历之后就无法修改了哦! 2 学校的英文全称请准确填写。 如果您正在攻读NLP,IR或搜索领域专业的博士或硕士研究生,而且对应用科学家的实习工作感兴趣。如果您也喜爱深入研究棘手的技术问题并提出解决方案,用成功的产品显著地改善人们的生活。 那么,我们诚挚邀请您加入亚马逊的International Technology搜索团队改善Amazon的产品搜索服务。我们的目标是帮助亚马逊的客户找到他们所需的产品,并发现他们感兴趣的新产品。 这会是一份收获满满的工作。您每天的工作都与全球数百万亚马逊客户的体验紧密相关。您将提出和探索NLP和IR领域的创新,基于TB级别的产品和流量数据设计机器学习模型。您将集成这些模型到搜索引擎中为客户提供服务,通过数据,建模和客户反馈来完成闭环。您对模型的选择需要能够平衡业务指标和响应时间的需求。
IL, Haifa
Come build the future of entertainment with us. Are you interested in helping shape the future of movies and television? Do you want to help define the next generation of how and what Amazon customers are watching? Prime Video is a premium streaming service that offers customers a vast collection of TV shows and movies - all with the ease of finding what they love to watch in one place. We offer customers thousands of popular movies and TV shows from Originals and Exclusive content to exciting live sports events. We also offer our members the opportunity to subscribe to add-on channels which they can cancel at any time and to rent or buy new release movies and TV box sets on the Prime Video Store. Prime Video is a fast-paced, growth business - available in over 240 countries and territories worldwide. The team works in a dynamic environment where innovating on behalf of our customers is at the heart of everything we do. If this sounds exciting to you, please read on We are seeking an exceptional Applied Scientist to join our Prime Video Sports tech team in Israel. Our team is dedicated to developing state-of-the-art science to allow for personalizing the customers’ experience and customers to seamlessly find any live event in our selection. You will have the opportunity to work on innovative, large-scale projects that push the boundaries of what's possible in sports content delivery and engagement. Your expertise will be crucial in tackling complex challenges such as temporal information retrieval, leveraging Generative AI and Large Language Models (LLMs), and building state-of-the-art recommender systems. Key job responsibilities We are looking for an Applied Scientist with domain expertise in Personalization, Information Retrieval, and Recommender Systems, or general ML to lead the development of new algorithms and end-to-end solutions. As part of our team of applied scientists and software development engineers, you will be responsible for researching, designing, developing, and deploying algorithms into production pipelines. Your role will involve working with cutting-edge technologies such as Gen AI/LLMs to enhance content discovery and search capabilities. You'll also tackle unique challenges like temporal information retrieval to improve real-time sports content recommendations. As a technologist, you will drive the publication of original work in top-tier conferences in Machine Learning and Information Retrieval. We expect you to thrive in ambiguous situations, demonstrating outstanding analytical abilities and comfort in collaborating with cross-functional teams and systems. The ideal candidate is a self-starter with the ability to learn and adapt quickly in our fast-paced environment. About the team We are the Prime Video Sports team. In September 2018 Prime Video launched its first full-scale live streaming experience to world-wide Prime customers with NFL Thursday Night Football. That was just the start. Now Amazon has exclusive broadcasting rights to major leagues like NFL Thursday Night Football, Tennis major like Roland-Garros and English Premium League to list few and are broadcasting live events across 30+ sports world-wide. Prime Video is expanding not just the breadth of live content that it offers, but the depth of the experience. This is a transformative opportunity, the chance to be at the vanguard of a program that will revolutionize Prime Video, and the live streaming experience of customers everywhere.
IL, Haifa
Come build the future of entertainment with us. Are you interested in helping shape the future of movies and television? Do you want to help define the next generation of how and what Amazon customers are watching? Prime Video is a premium streaming service that offers customers a vast collection of TV shows and movies - all with the ease of finding what they love to watch in one place. We offer customers thousands of popular movies and TV shows from Originals and Exclusive content to exciting live sports events. We also offer our members the opportunity to subscribe to add-on channels which they can cancel at any time and to rent or buy new release movies and TV box sets on the Prime Video Store. Prime Video is a fast-paced, growth business - available in over 240 countries and territories worldwide. The team works in a dynamic environment where innovating on behalf of our customers is at the heart of everything we do. If this sounds exciting to you, please read on. We are seeking an exceptional Sr. Applied Scientist to join our Prime Video Sports tech team in Israel. Our team is dedicated to developing state-of-the-art science to allow for personalizing the customers’ experience and customers to seamlessly find any live event in our selection. You will have the opportunity to work on innovative, large-scale projects that push the boundaries of what's possible in sports content delivery and engagement. Your expertise will be crucial in tackling complex challenges such as temporal information retrieval, leveraging Generative AI and Large Language Models (LLMs), and building state-of-the-art recommender systems. Key job responsibilities We are looking for a Senior Applied Scientist with domain expertise in Personalization, Information Retrieval, and Recommender Systems, or general ML to lead the development of new algorithms and end-to-end solutions. As part of our team of applied scientists and software development engineers, you will be responsible for researching, designing, developing, and deploying algorithms into production pipelines. Your role will involve working with cutting-edge technologies such as GenAI/LLMs to enhance content discovery and search capabilities. You'll also tackle unique challenges like temporal information retrieval to improve real-time sports content recommendations. As a technologist, you will drive the publication of original work in top-tier conferences in Machine Learning and Information Retrieval. We expect you to thrive in ambiguous situations, demonstrating outstanding analytical abilities and comfort in collaborating with cross-functional teams and systems. The ideal candidate is a self-starter with the ability to learn and adapt quickly in our fast-paced environment. About the team We are the Prime Video Sports team. As part of this team, you will be working on the science behind the Discovery, Personalization and Search experiences of PV Sports. In September 2018 Prime Video launched its first full-scale live streaming experience to world-wide Prime customers with NFL Thursday Night Football. That was just the start. Now Amazon has exclusive broadcasting rights to major leagues like NFL Thursday Night Football, Tennis major like Roland-Garros and English Premium League to list few and are broadcasting live events across 30+ sports world-wide. Prime Video is expanding not just the breadth of live content that it offers, but the depth of the experience. This is a transformative opportunity, the chance to be at the vanguard of a program that will revolutionize Prime Video, and the live streaming experience of customers everywhere.
CA, QC, Montreal
Amazon Games recherche un.e scientifique en apprentissage automatique sénior.e pour développer et intégrer de nouvelles approches d'apprentissage automatique (ML), d'apprentissage par renforcement (RL) et d'IA générative (Gen AI) dans nos processus de développement de jeux et dans nos expériences de jeux. Dans ce rôle, vous travaillerez en étroite collaboration avec nos studios de développement de jeux et nos équipes opérationnelles pour imaginer et développer des outils, des processus et des fonctionnalités alimentés par l'IA générative à travers Amazon Games. Chez Amazon Games, notre ambition est de créer de expériences inédites et audacieuses qui rassemblent et cultivent les communautés de joueurs et de joueuses. Notre équipe d'experts de l'industrie développe des jeux multijoueurs AAA et des propriétés intellectuelles originales, avec des équipes à Seattle, Orange County, San Diego, Montréal et Bucarest. À travers nos divisions - Studios, Publishing et Prime Gaming et en collaboration avec des partenaires externes, nous développons, publions et livrons des jeux et des expériences de contenu exceptionnelles pour les joueurs et joueuses. /// Amazon Games is seeking a highly effective Senior Machine Learning Scientist to build and integrate novel ML, RL and Generative AI (Gen AI) approaches into our game pipelines and customer experiences. In this role, you will work closely with our game development studios and operations teams to research and develop generative AI-powered tools, pipelines and features across Amazon Games. At Amazon Games, our ambition is to create bold new experiences that foster community in and around our games. Our team of game industry veterans develops AAA multiplayer games and original IPs, with teams in Seattle, Orange County, San Diego, Montreal, and Bucharest. Amazon Games, through its Studios, Publishing, and Prime Gaming divisions collaborating with external partners, aims to develop, publish, and deliver compelling AAA games and content experiences for gamers to discover. Key job responsibilities Responsabilités - Diriger la recherche, l'implémentation et la mise en production d'initiatives ambitieuses et complexes en IA/ML pour Amazon Games. - Collaborer avec les équipes de programmation, de conception et artistique pour concevoir, développer et intégrer de nouveaux outils d'IA générative dans les flux de travail des développeuses et développeurs. - Identifier et résoudre de manière proactive les problèmes qui affectent la qualité de vie des joueurs, des opérations et des autres développeurs. - Se tenir au courant et analyser les dernières avancées en matière de technologie d'IA générative, et améliorer continuellement les fonctionnalités des produits lorsque des améliorations significatives en termes de coût, d'évolutivité, de qualité ou de fonctionnalité peuvent être réalisées. - Consulter et contribuer aux évaluations d'autres services internes ou tiers de ML, RL et Gen AI qui pourraient être utilisés par le projet ou l'organisation. /// Responsibilities - Drive the research, implementation, and productionizing for ambitious and complex AI/ML initiatives for Amazon Games. - Collaborate with game team engineers, designers and artists to design, develop, and integrate new generative AI tools into developer workflows. - Proactively identify and solve problems that affect the quality of life for players, operations, and other developers. - Stay up to date with and analyze the latest advancements, in generative AI technology, and continuously improve product features where meaningful improvements in cost, scalability, quality, or functionality can be achieved. - Consult and contribute to evaluations of other internal or 3rd ML, RL and Gen AI services that could be leveraged by the project or the organization. A day in the life Une journée type - Vous vous épanouissez dans un environnement collaboratif où vos décisions ont un impact et une influence significatifs. - Vous exprimer votre passion par la création d'expériences de jeu qui ravissent les joueurs et les joueuses. - Vous proposez d'excellents flux de travail, outils et innovations de jeu à vos collègues et aux équipes de développement et recherchez constamment l'amélioration. - Vous souhaitez faire partie de quelque chose d'excitant et unique dans l'écosystème du jeu. /// A day in the life - You thrive in a collaborative environment where your decisions have significant impact and influence. - You are passionate about building game experiences that delight players. - You deliver great workflows, tools, and game innovations to your fellow developers and constantly seek improvement. - You want to be part of something exciting and unique in the gaming ecosystem. About the team À propos de l'équipe L'équipe de recherche en IA d'Amazon Games Studio se concentre sur l'innovation en intelligence artificielle dans le domaine du jeu vidéo. Notre équipe hautement qualifiée et multidisciplinaire travaille sur l'apprentissage automatique, l'apprentissage par renforcement et l'IA générative pour réinventer le développement des jeux. Nous travaillons de près avec les équipe internes et nos studios partenaires pour donner vie à leur vision créative. Notre mission est d'utiliser l'IA de manière responsable pour transformer l'expérience de jeu, enrichir les récits, et fournir aux créateurs et créatrices des outils pratiques pour optimiser leurs chaînes de production. /// About the Team The Amazon Games Studio AI Research team focuses on artificial intelligence innovation in gaming. Our highly skilled, multi-discipline team works across Machine Learning, Reinforcement Learning, and Generative AI to reimagine game development. We work closely with first-party game developers and partner studios to bring creative visions to life. Our mission is to use AI responsibly to transform gameplay experiences, enrich narratives, and provide creators with practical tools to optimize their production pipelines.
US, CA, San Diego
Amazon Games is seeking a highly effective Senior Machine Learning Scientist to build and integrate novel ML, RL and Generative AI (Gen AI) approaches into our game pipelines and customer experiences. In this role, you will work closely with our game development studios and operations teams to research and develop generative AI-powered tools, pipelines and features across Amazon Games. At Amazon Games, our ambition is to create bold new experiences that foster community in and around our games. Our team of game industry veterans develops AAA multiplayer games and original IPs, with teams in Seattle, Orange County, San Diego, Montreal, and Bucharest. Amazon Games, through its Studios, Publishing, and Prime Gaming divisions collaborating with external partners, aims to develop, publish, and deliver compelling AAA games and content experiences for gamers to discover. Key job responsibilities - Drive the research, implementation, and productionizing for ambitious and complex AI/ML initiatives for Amazon Games. - Collaborate with game team engineers, designers and artists to design, develop, and integrate new generative AI tools into developer workflows. - Proactively identify and solve problems that affect the quality of life for players, operations, and other developers. - Stay up to date with and analyze the latest advancements, in generative AI technology, and continuously improve product features where meaningful improvements in cost, scalability, quality, or functionality can be achieved. - Consult and contribute to evaluations of other internal or 3rd ML, RL and Gen AI services that could be leveraged by the project or the organization. A day in the life - You thrive in a collaborative environment where your decisions have significant impact and influence. - You are passionate about building game experiences that delight players. - You deliver great workflows, tools, and game innovations to your fellow developers and constantly seek improvement. - You want to be part of something exciting and unique in the gaming ecosystem. About the team The Amazon Games Studio AI Research team focuses on artificial intelligence innovation in gaming. Our highly skilled, multi-discipline team works across Machine Learning, Reinforcement Learning, and Generative AI to reimagine game development. We work closely with first-party game developers and partner studios to bring creative visions to life. Our mission is to use AI responsibly to transform gameplay experiences, enrich narratives, and provide creators with practical tools to optimize their production pipelines.
US, WA, Seattle
We’re working to improve shopping on Amazon using the conversational capabilities of large language models, and are searching for pioneers who are passionate about technology, innovation, and customer experience, and are ready to make a lasting impact on the industry. You'll be working with talented scientists, engineers, and technical program managers (TPM) to innovate on behalf of our customers. If you're fired up about being part of a dynamic, driven team, then this is your moment to join us on this exciting journey!