Large Language Models (LLMs): an Ontological Leap in AI

Posted: December 27th, 2022 | Author: Domingo | Filed under: Artificial Intelligence, Natural Language Processing | Tags: AI, artificial intelligence, Large Language Models, LLMs, natural language processing, NLP | Comments Off

Beyond its quasi-human interaction and the practically unlimited use cases it could cover, OpenAI’s ChatGPT has delivered an ontological jolt whose depth transcends the realm of AI itself.

Large language models (LLMs), such as GPT-3, YUAN 1.0, BERT, LaMDA, Wordcraft, HyperCLOVA, Megatron-Turing Natural Language Generation, or PanGu-Alpha, represent a major advance in artificial intelligence and, in particular, toward the goal of human-like artificial general intelligence. LLMs have been called foundation models; that is, the infrastructure that made LLMs possible (the combination of enormously large data sets, pre-trained transformer models, and significant computing power) is likely to be the basis for the first general-purpose AI technologies.

In May 2020, OpenAI introduced GPT-3 (Generative Pre-trained Transformer 3), an artificial intelligence system based on deep learning techniques that can generate text. It does so by analyzing text samples with a neural network, each layer of which analyzes a different aspect of the samples it is provided with; e.g., meanings of words, relations of words, sentence structures, and so on. It assigns numerical values to words and then, after analyzing large amounts of text, calculates the likelihood that one particular word will follow another. Amongst other tasks, GPT-3 can write short stories, novels, news reports, scientific papers, code, and mathematical formulas. It can write in different styles and imitate the style of the text prompt. It can also answer content-based questions; i.e., it learns the content of texts and can articulate this content. It can also provide concise summaries of lengthy passages.
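The idea of calculating the likelihood that one word follows another can be illustrated with a toy example. The sketch below is not how GPT-3 works internally (GPT-3 uses a transformer network over tokens); it is a much simpler bigram count model, and the corpus and function names are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, how often every other word follows it."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def next_word_probabilities(follow_counts, word):
    """Turn raw follow counts into relative frequencies for one word."""
    counts = follow_counts.get(word.lower(), Counter())
    total = sum(counts.values())
    if total == 0:
        return {}  # the word was never seen with a successor
    return {candidate: count / total for candidate, count in counts.items()}


# Hypothetical toy corpus; a real model is trained on vastly more text.
corpus = "the cat sat on the mat and the cat slept"
model = train_bigram_model(corpus)
probabilities = next_word_probabilities(model, "the")
print(probabilities)
```

In this toy corpus, "the" is followed by "cat" twice and by "mat" once, so the model considers "cat" the more likely continuation.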

OpenAI and its peers endow machines with structuralist equipment: a formal logical analysis of language as a system, which lets machines participate in language. GPT-3 and other transformer-based language models stand in direct continuity with the work of the linguist Saussure: language comes into view as a logical system to which the speaker is merely incidental. These LLMs give rise to a new concept of language, implicit in which is a new understanding of human and machine. OpenAI, Google, Facebook, and Microsoft are in effect catalysts, triggering a disruption in the old concepts we have been living by so far: a machine with linguistic capabilities is simply a revolution.

Nonetheless, LLMs have also attracted criticism. The usual objection is that no matter how good they may appear to be at using words, they do not have true language; drawing on the seminal work of the philologist Zipf, critics have argued that they are just technical systems made up of data, statistics, and predictions.

According to the linguist Emily Bender, “a language model is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot.” By contrast, we human beings are intentional subjects who can make things into objects of thought by inventing and endowing meaning.

Machine learning engineers in companies like OpenAI, Google, Facebook, or Microsoft have experimentally established a concept of language that does not need the human at its center. According to this new concept, language is a system organized by an internal combinatorial logic that is independent of whoever speaks (human or machine). They have undermined one of the most deeply rooted axioms in Western philosophy: that humans have what animals and machines do not have, language and logos.

Some data: on average, humans publish about seventy million posts a month on the content management platform WordPress. That amounts to about fifty-six billion words a month, or roughly 1.8 billion words a day. GPT-3, before its scintillating launch, was producing around 4.5 billion words a day, more than twice what humans on WordPress were producing collectively. And that is just GPT-3; there are other LLMs. We are exposed to a flood of non-human words. What will it mean to be surrounded by a multitude of non-human forms of intelligence? How can we relate to these astonishingly powerful content-generating LLMs? Do machines require semantics or even a will to communicate with us?

These are philosophical questions that cannot be solved with an engineering approach alone. The scope is much wider and the stakes are extremely high. Besides learning and mastering our human languages, LLMs can make us reflect on and question the nature of language, knowledge, and intelligence. Large language models illustrate, for the first time in the history of AI, that language understanding can be decoupled from all the sensorial and emotional features we human beings share with each other. It seems we are gradually entering a new epoch in AI.

Explainable Artificial Intelligence: A Main Foundation in Human-centered AI

Posted: March 30th, 2022 | Author: Domingo | Filed under: Artificial Intelligence, Human-centered explainable AI | Tags: AI, artificial intelligence, Explainable AI, HCXAI, Human-centered Explainable AI | Comments Off

Human-centered explainable AI (HCXAI) is an approach that puts the human at the center of technology design. It develops a holistic understanding of “who” the human is by considering the interplay of values, interpersonal dynamics, and the socially situated nature of AI systems.

Explainable AI (XAI) refers to artificial intelligence, and particularly machine learning techniques, that can provide human-understandable justification for their output behavior. Much of the previous and current work on explainable AI has focused on interpretability, which can be viewed as a property of machine-learned models that dictates the degree to which a human user (AI expert or non-expert user) can come to conclusions about the performance of the model given specific inputs.

An important distinction between interpretability and explanation generation is that explanation does not necessarily elucidate precisely how a model works, but aims to provide useful information for practitioners and users in an accessible manner. The challenges of designing and evaluating “black-boxed” AI systems depend crucially on “who” the human in the loop is. Understanding the “who” is crucial because it governs what the explanation requirements are. It also scopes how the data are collected, what data can be collected, and the most effective way of describing the why behind an action.

Explainable AI (XAI) techniques can be applied to black-box AI models in order to obtain post-hoc explanations based on the information those models provide. According to Prof. Dr. Corcho, rule extraction belongs to the group of post-hoc XAI techniques. This group of techniques is applied to an already trained ML model, generally a black-box one, in order to explain the decision frontier it has inferred, using the input features to obtain the predictions. Rule extraction techniques are further divided into two subgroups: model-specific and model-agnostic. Model-specific techniques generate the rules from specific information in the trained model, while model-agnostic ones only use the input and output information of the trained model, and hence can be applied to any other model. Post-hoc XAI techniques in general are then differentiated depending on whether they provide local explanations (explanations for a particular data point) or global ones (explanations for the whole model). Most rule extraction techniques have the advantage of providing explanations for both cases at the same time.
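A minimal sketch of model-agnostic rule extraction follows. It is an illustration under stated assumptions, not a specific published technique: the black-box model, the feature names (income, debt), and the sample grid are all hypothetical. The extractor only calls the model's prediction function, so it never looks inside the model, and it searches for the single threshold rule that agrees most often with the model's outputs.

```python
def black_box_predict(features):
    """Hypothetical opaque model: only its outputs are observed."""
    income, debt = features
    return 1 if 0.7 * income - 1.2 * debt > 10 else 0


def extract_threshold_rule(predict, samples, feature_names):
    """Find the one-feature threshold rule that best mimics predict().

    Only inputs and outputs of the model are used (model-agnostic).
    """
    labels = [predict(sample) for sample in samples]
    best_rule = None
    for index, name in enumerate(feature_names):
        for threshold in sorted({sample[index] for sample in samples}):
            for positive_above in (True, False):
                # Rule output: class 1 on one side of the threshold.
                outputs = [
                    int((sample[index] > threshold) == positive_above)
                    for sample in samples
                ]
                agreement = sum(
                    o == l for o, l in zip(outputs, labels)
                ) / len(samples)
                if best_rule is None or agreement > best_rule["agreement"]:
                    best_rule = {
                        "feature": name,
                        "threshold": threshold,
                        "positive_above": positive_above,
                        "agreement": agreement,
                    }
    return best_rule


# Hypothetical probe points covering the input space.
samples = [(income, debt)
           for income in range(0, 60, 5)
           for debt in range(0, 30, 5)]
rule = extract_threshold_rule(black_box_predict, samples, ["income", "debt"])
direction = ">" if rule["positive_above"] else "<="
print(f"IF {rule['feature']} {direction} {rule['threshold']} THEN class 1 "
      f"(agreement with model: {rule['agreement']:.2f})")
```

The extracted rule is a global explanation, since it summarizes the whole model; evaluating it at one data point gives a local explanation for that point.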

The researchers Carvalho, Pereira, and Cardoso have defined a taxonomy of properties that should be considered in the individual explanations generated by XAI techniques:

Accuracy: It refers to how well the explanations can be used to predict the output for data the model has not seen.

Fidelity: It refers to how well the explanations approximate the underlying model. The explanations will have high fidelity if their predictions are consistently similar to the ones obtained by the black-box model.

Consistency: It refers to the similarity of the explanations obtained from two different models trained on the same input data set. High consistency appears when the explanations obtained from the two models are similar. However, low consistency may not be a bad result, since the models may be extracting different valid patterns from the same data set due to the “Rashomon Effect”: seemingly contradictory accounts are in fact telling the same story from different perspectives.

Stability: It measures how similar the explanations obtained are for similar data points. In contrast to consistency, stability measures the similarity of explanations using the same underlying model.

Comprehensibility: This metric is related to how well a human will understand the explanation. It is therefore very difficult to define mathematically, since it is affected by many subjective elements of human perception such as context, background, and prior knowledge. However, some objective elements can be considered in order to measure “comprehensibility”, such as whether the explanations are based on the original features (or on synthetic ones generated from them), the length of the explanations (how many features they include), or the number of explanations generated (e.g., in the case of global explanations).

Certainty: It refers to whether the explanations include the certainty of the model about the prediction or not (i.e. a metric score).

Importance: Some XAI methods that use features for their explanations include a weight associated with the relative importance of each of those features.

Novelty: Some explanations may include whether the data point to be explained comes from a region of the feature space that is far away from the distribution of the training data. This is something important to consider in many cases, since the explanation may not be reliable due to the fact that the data point to be explained is very different from the ones used to generate the explanations.

Representativeness: It measures how many instances are covered by the explanation. Explanations can range from explaining a whole model (e.g., the weights in a linear regression) to explaining only a single data point.
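Two of the properties above, fidelity and representativeness, lend themselves to a simple numerical reading. The sketch below is one possible way to compute them, assuming a hypothetical black-box model, a hypothetical one-rule explanation, and a handful of made-up data points; the taxonomy itself does not prescribe these formulas.

```python
def fidelity(explanation_predict, model_predict, samples):
    """Share of samples where the explanation agrees with the model."""
    matches = sum(
        explanation_predict(s) == model_predict(s) for s in samples
    )
    return matches / len(samples)


def representativeness(rule_applies, samples):
    """Share of samples covered by the rule's condition."""
    return sum(1 for s in samples if rule_applies(s)) / len(samples)


# Hypothetical black-box model over two numeric features.
def model_predict(sample):
    return int(sample[0] + sample[1] > 10)


# Hypothetical explanation: "IF second feature > 4 THEN class 1".
def rule_applies(sample):
    return sample[1] > 4


def explanation_predict(sample):
    return int(rule_applies(sample))


samples = [(2, 3), (6, 1), (8, 5), (4, 9), (7, 2), (1, 1), (9, 3)]

fidelity_score = fidelity(explanation_predict, model_predict, samples)
coverage = representativeness(rule_applies, samples)
print(f"fidelity={fidelity_score:.2f}, representativeness={coverage:.2f}")
```

Here the rule disagrees with the model only on the point (9, 3), so it matches on six of the seven samples, while its condition covers two of the seven.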

In the realm of psychology there are three kinds of views of explanations:

The formal-logical view: an explanation is like a deductive proof, given some propositions.
The ontological view: events (states of affairs) explain other events.
The pragmatic view: an explanation needs to be understandable by the “demander”.

Explanations that are sound from a formal-logical or ontological view, but leave the demander in the dark, are not considered good explanations. For example, a very long chain of logical steps or events (e.g. hundreds) without any additional structure can hardly be considered a good explanation for a person, simply because he or she will lose track.

On top of this, the level of explanation refers to whether the explanation is given at a high level or at a more detailed level. The right level depends on the knowledge and the need of the demander: he or she may be satisfied with some parts of the explanation happening at the higher level, while other parts need to be at a more detailed level. The kind of explanation refers to notions like causal explanations and mechanistic explanations. Causal explanations provide the causal relationship between events but without explaining how they come about: a kind of “why” question. For instance, smoking causes cancer. A mechanistic explanation would explain the mechanism whereby smoking causes cancer: a kind of “how” question.

As said, a satisfactory explanation does not exist by itself, but depends on the demander’s need. In the context of machine learning algorithms, several typical demanders of explainable algorithms can be distinguished:

Domain experts: those are the “professional” users of the model, such as medical doctors, who need to understand the workings of the model before they can accept and use it.

Regulators, external and internal auditors: like the domain experts, those demanders need to understand the workings of the model in order to certify its compliance with company policies or existing laws and regulations.

Practitioners: professionals who use the model in the field, taking users’ input, applying the model, and subsequently communicating the result to the users, in situations such as loan applications.

Redress authorities: the designated competent authority to verify that an algorithmic decision for a specific case is compliant with the existing laws and regulations.

Users: people to whom the algorithms are applied and who need an explanation of the result.

Data scientists, developers: technical people who develop or reuse the models and need to understand the inner workings in detail.

Summing up, for explainable AI to be effective, the final consumers (people) of the explanations need to be duly considered when designing HCXAI systems. AI systems are only truly regarded as “working” when their operation can be narrated in intentional vocabulary, using words whose meanings go beyond the mathematical structures. When an AI system “works” in this broader sense, it is clearly a discursive construction, not just a mathematical fact, and the discursive construction succeeds only if the community assents.

To Overcome the Reluctance for Accepting AI, We Must Highlight the Gains in Terms of Productivity and Efficiency, Using Plain Language.

Posted: January 28th, 2021 | Author: Domingo | Filed under: Artificial Intelligence, Interviews | Tags: AI, artificial intelligence, expert.ai, hAItta | Comments Off

To mark their appointment to seats on the Redesigning Financial Services Strategic Steering Committee, expert.ai’s Chief Operating Officer, Gabriele Donino, and its Managing Director Switzerland, Domingo Senise de Gracia, were interviewed about the use of artificial intelligence and its potential, opportunities, and barriers.

Link to the interview.

Artificial Intelligence to Fight Money Laundering

Posted: January 17th, 2021 | Author: Domingo | Filed under: Artificial Intelligence | Tags: AI, AML, Anti-money Laundering, artificial intelligence, Blanqueo de capitales, ia, inteligencia artificial | Comments Off

Money laundering is legally defined as the transfer of illegally obtained money through legitimate persons or accounts, so that its original source cannot be traced.

The International Monetary Fund (IMF) estimates that the aggregate size of money laundering worldwide is approximately 3.2 trillion dollars, or 3% of global GDP. The proceeds of money laundering are often used to finance crimes such as terrorism, human trafficking, drug trafficking, and illegal arms sales. Banks and other financial institutions implement anti-money laundering (AML) systems. Failing to comply with AML regulations is a type of corporate crime and poses a serious reputational risk for these financial institutions. Despite current efforts, several multinational financial institutions have been heavily fined by AML regulators in recent years for the ineffectiveness of their practices.

Introducing artificial intelligence to fight money laundering improves and eases the overall decision-making process while maintaining compliance with policies such as the General Data Protection Regulation (GDPR). AI can minimize the number of transactions falsely flagged as suspicious, achieve a demonstrable quality of compliance with regulatory expectations, and improve the productivity of operational resources.

Placement, layering, and integration are the three phases of money laundering. In the placement phase, the proceeds of criminal activity are converted into monetary instruments or otherwise deposited in a financial institution (or both). Layering refers to the transfer of funds to other financial institutions or persons by wire transfer, check, money order, or other methods. In the final integration phase, the funds are used to acquire legitimate assets or to continue financing criminal enterprises. At this point, the illegally obtained money becomes part of the legitimate economy. Artificial intelligence approaches can be applied to identify money laundering activities in each of the three phases. Common machine learning methods such as support vector machines and random forests can be used to classify fraudulent transactions using large annotated banking data sets.

Today, typical AML schemes can be broken down into four layers. The first is the data layer, where the relevant data are collected, managed, and stored. This includes both the financial institution’s internal data and external data from sources such as regulatory agencies, authorities, and watch lists. The second layer, the screening and monitoring layer, examines transactions and customers for suspicious activity. Financial institutions have largely automated this layer in a multi-stage procedure that is often based on rules or risk analysis. If suspicious activity is found, it is passed to the alert and event layer for closer inspection. The use of data from social networks and the web to gather information for the investigation is underdeveloped in current AML systems. A human analyst makes the decision to block or approve a transaction in the operations layer.

Natural Language Processing, Ontology Engineering, Machine Learning, Deep Learning, and Sentiment Analysis

Natural language processing (NLP) and ontology engineering, both fields of artificial intelligence, can help ease the workload by providing human experts with an assessment and a visualization of relationships based on news data: for example, the banks’ own news database and traditional news sources or social networks, in relation to the potentially fraudulent entity. One approach to identifying money laundering is to define a knowledge graph of the entities involved. Entity recognition is a set of algorithms capable of recognizing the relevant entities; namely, persons, job titles, and companies mentioned in an input text string. Relation extraction detects the relation between two named entities (e1, e2) in a given sentence, typically expressed as a triple [e1, r, e2], where r is a relation between e1 and e2. Entity resolution determines whether references to entities mentioned in various records and documents refer to the same entity or to different ones. For example, the same person may be mentioned in different ways, and an organization could have different addresses. The main challenges in graph learning for AML are the speed of graph learning and analysis and the size of the graphs. Fast graph learning uses fast convolutional neural networks and dramatically increases training speeds compared with conventional convolutional neural networks. Relation analysis, sentiment analysis, and many other techniques based on NLP and knowledge graphs are often used to reduce the high false-positive rates in AML.
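The triple representation [e1, r, e2] and the role of entity resolution can be sketched in a few lines. This is a minimal illustration, assuming the triples have already been extracted from text; the entity names, relation labels, and alias table are hypothetical, and a real system would resolve entities with far more than a lookup table.

```python
from collections import defaultdict

# Hypothetical alias table standing in for an entity-resolution step:
# different mentions that refer to the same real-world entity.
ALIASES = {
    "j. doe": "John Doe",
    "john doe": "John Doe",
    "acme ltd": "Acme Ltd",
    "acme limited": "Acme Ltd",
}


def resolve_entity(mention):
    """Map a raw mention to its canonical entity name."""
    cleaned = mention.strip()
    return ALIASES.get(cleaned.lower(), cleaned)


def build_knowledge_graph(triples):
    """Build an adjacency map from (e1, r, e2) triples."""
    graph = defaultdict(set)
    for e1, relation, e2 in triples:
        graph[resolve_entity(e1)].add((relation, resolve_entity(e2)))
    return graph


# Hypothetical triples, as a relation-extraction step might emit them.
triples = [
    ("J. Doe", "director_of", "Acme Limited"),
    ("John Doe", "owns", "Acme Ltd"),
    ("Acme Ltd", "transferred_funds_to", "Shell Co"),
]

graph = build_knowledge_graph(triples)
for entity, edges in graph.items():
    print(entity, sorted(edges))
```

Without the resolution step, "J. Doe" and "John Doe" would become two separate nodes, and the link between the person and the company's transfers would be harder for an analyst to see.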

Another way to frame AI and data mining in AML is anomaly detection using machine learning techniques. Under this method, one first defines what a normal or typical transaction looks like and then detects any transaction that is different enough to be considered anomalous. A cluster of common elements is defined in order to capture a customer’s typical spending habits. Clustering is a standard method for defining these clusters; a distance is then computed between incoming transactions and the clusters in order to detect anomalous behavior, for example using the k-means clustering algorithm.
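The cluster-then-measure-distance procedure can be sketched as follows. This is a minimal pure-Python k-means under stated assumptions: the transaction features (amount, hour of day), the historical data, the deterministic initialization, and the distance threshold are all hypothetical, and the features are left unscaled for brevity.

```python
import math


def k_means(points, k, iterations=20):
    """Plain k-means; initial centroids are simply the first k points."""
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(
                range(k), key=lambda i: math.dist(point, centroids[i])
            )
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(values) / len(cluster) for values in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids


def is_anomalous(transaction, centroids, threshold):
    """Flag a transaction that is far from every learned cluster."""
    distance = min(math.dist(transaction, c) for c in centroids)
    return distance > threshold


# Hypothetical history of (amount, hour of day) for one customer.
history = [(20, 9), (25, 10), (22, 11), (200, 20), (210, 21), (190, 19)]
centroids = k_means(history, k=2)

THRESHOLD = 30  # hypothetical cut-off, to be tuned on real data

print(is_anomalous((23, 10), centroids, THRESHOLD))
print(is_anomalous((1000, 3), centroids, THRESHOLD))
```

A transaction close to one of the customer's usual spending patterns falls within the threshold of a centroid, while a very large payment at an unusual hour lies far from both clusters and is flagged for review.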

The significant leap forward has come from using, in contrast to conventional machine learning approaches, deep learning methods to learn feature representations from raw data. In deep learning techniques, multiple layers of representation are learned from a raw-data input layer, using non-linear transformations at each level of representation learning. NLP and deep learning are already used at many levels of AML regulatory compliance.

Sentiment analysis can also be useful in AML. Understood as a task of automatically classifying documents at scale, by means of NLP, according to the positive or negative connotation of their language, its main function is to shorten the investigation period for a compliance officer. It can be applied at different levels, including the stages of backlog management, customer onboarding, and customer profile monitoring. The goal of a sentiment analysis system in this context is to monitor the sentiment trends associated with a customer in order to identify important patterns. When AML investigators identify a company that has potentially been involved in a suspicious transaction, they usually search the Internet for evidence. Analyzing the sentiment levels of the news about a specific organization can reveal a great deal of evidence. NLP-based sentiment analysis can examine thousands of articles in seconds, significantly improving the investigation process in terms of efficiency and accuracy. Sentiment analysis can also be used in customer profile monitoring and customer onboarding, in order to investigate and identify a customer’s specific weak points and links to negative articles. In terms of AI, numerous techniques have been used for sentiment analysis, among them support vector machines, conditional random fields, and deep neural networks such as convolutional neural networks and recurrent neural networks.

Explainable Artificial Intelligence Methods

The effectiveness of AI systems is limited to some extent by their ability to explain a specific decision or prediction they have made. The nature of the explanation varies with the data and the algorithms, and so far no common framework or standard for explanation has been implemented.

Communication with analysts is of the utmost importance when designing any AML system, since the users make the final decision. Explainable AI methods work by giving users clear information about why a prediction was made (for example, why the system believes a transaction is suspicious) in order to help users reach a decision and to foster their understanding of the technology. It is important that any system can explain its decisions in a way that is simple for the user. European policies and the General Data Protection Regulation stress the need for financial institutions to provide explainable, human-authorized decisions. It is essential that any AML method include a human analyst and ensure that the analyst clearly understands the information presented. A “black box” system that labels a transaction as “fraudulent” without any explanation or reasoning is unacceptable.

Finally, in this future shared working environment, the final decision made by the human, which may or may not support the system’s prediction, should be fed back into the AI model to improve its decision-making ability. AML systems should not be linear but cyclical, with AI models communicating with and learning from analysts. Only through this joint effort of human beings and artificial intelligence will AML procedures achieve exceptional success.

Growing in Switzerland: the Example of expert.ai

Posted: November 11th, 2020 | Author: Domingo | Filed under: Artificial Intelligence | Tags: AI, artificial intelligence, expert.ai, hAItta, ia, inteligencia artificial | Comments Off

In 2020, the AI Forum Live was born. This comprehensive digital event brings together AI leaders and experts to learn more about cutting-edge artificial intelligence strategies and solutions. Organised by Associazione Italiana per l’Intelligenza Artificiale (AIxIA), which promotes the study and research of AI, this live forum reunites the world of research with that of business in the hope of building promising new collaborations.

Expert.ai’s Managing Director Switzerland, Domingo Senise de Gracia, will take part in a workshop on November 3rd at 12.30 pm CET to discuss expert.ai’s international expansion and new venture in Switzerland. The presentation will share the main challenges and opportunities expert.ai considered when choosing Switzerland as a strategic environment to leverage and deploy its AI approach.

DomingoSenise.com
