Oren Glickman

  • Doctor of Philosophy
  • Professor (Assistant) at Bar Ilan University

About

37
Publications
8,597
Reads
2,985
Citations
Introduction
Natural Language Processing
Skills and Expertise
Current institution
Bar Ilan University
Current position
  • Professor (Assistant)

Publications (37)
Article
Full-text available
Wildfires pose a significant natural disaster risk to populations and contribute to accelerated climate change. As wildfires are also affected by climate change, extreme wildfires are becoming increasingly frequent. Although they occur less frequently globally than those sparked by human activities, lightning-ignited wildfires play a substantial ro...
Preprint
Full-text available
Deep learning (DL) models have gained prominence in domains such as computer vision and natural language processing but remain underutilized for regression tasks involving tabular data. In these cases, traditional machine learning (ML) models often outperform DL models. In this study, we propose and evaluate various data augmentation (DA) technique...
Article
Full-text available
Machine learning (ML) and deep learning (DL) models are gaining popularity due to their effectiveness in many computational tasks. These models are based on an intuitive, but frequently unsatisfied, assumption that the data used to train these models is well-representing the task at hand. This gives rise to the out-of-distribution (OOD) challenge w...
Preprint
Full-text available
Wildfires pose a significant natural disaster risk to populations and contribute to accelerated climate change. As wildfires are also affected by climate change, extreme wildfires are becoming increasingly frequent. Although they occur less frequently globally than those sparked by human activities, lightning-ignited wildfires play a substantial ro...
Preprint
Full-text available
The analysis of tabular datasets is highly prevalent both in scientific research and real-world applications of Machine Learning (ML). Unlike many other ML tasks, Deep Learning (DL) models often do not outperform traditional methods in this area. Previous comparative benchmarks have shown that DL performance is frequently equivalent or even inferio...
Preprint
Full-text available
We explore generating factual and accurate tables from the parametric knowledge of large language models (LLMs). While LLMs have demonstrated impressive capabilities in recreating knowledge bases and generating free-form text, we focus on generating structured tabular data, which is crucial in domains like finance and healthcare. We examine the tab...
Article
Full-text available
In the realm of machine and deep learning (DL) regression tasks, the role of effective feature engineering (FE) is pivotal in enhancing model performance. Traditional approaches of FE often rely on domain expertise to manually design features for machine learning (ML) models. In the context of DL models, the FE is embedded in the neural network’s a...
Article
Full-text available
Near-surface air temperature (Ta) is a key variable in global climate studies. Global climate models such as ERA5 and CMIP6 predict various parameters at coarse spatial resolution (>9 km). As a result, local phenomena such as the urban heat islands are not reflected in the model’s outputs. In this study, we address this limitation by downscaling th...
Article
Fire risk mapping (mapping the probability of fire occurrence and spread) is essential for pre-fire management as well as for efficient firefighting efforts. Most fire risk maps are generated using static information on variables such as topography, vegetation density, and fuel instantaneous wetness. Satellites are often used to provide such informat...
Patent
Full-text available
A system for generating a demographic profile for a set of at least one webpage, the system comprising a webpage audience information gatherer operative for providing, for at least one webpage, training data including demographic information characterizing an audience of the webpage; a predictor developing system operative to compute at least one c...
Patent
Full-text available
Systems and methods for extracting information from structured documents are provided. The systems and methods relate to selecting a centroid document from a group of structured documents, selecting a subset of the group of structured documents in order to form a cluster of the subset of documents about the centroid document. The selecting the subs...
Conference Paper
This paper investigates conceptually and empirically the novel sense matching task, which requires to recognize whether the senses of two synonymous words match in context. We suggest direct approaches to the problem, which avoid the intermediate step of explicit word sense disambiguation, and demonstrate their appealing advantages and stimulat...
Conference Paper
Semantic lexical matching is a prominent subtask within text understanding applications. Yet, it is rarely evaluated in a direct manner. This paper proposes a definition for lexical reference which captures the common goals of lexical matching. Based on this definition we created and analyzed a test dataset that was utilized to directly evalu...
Conference Paper
Semantic lexical matching is a prominent subtask within text understanding applications. Yet, it is rarely evaluated in a direct manner. This paper proposes a definition for lexical reference which captures the common goals of lexical matching. Based on this definition we created and analyzed a test dataset that was utilized to directly evaluate, c...
Article
Full-text available
This paper investigates an isolated setting of the lexical substitution task of replacing words with their synonyms. In particular, we examine this problem in the setting of subtitle generation and evaluate state of the art scoring methods that predict the validity of a given substitution. The paper evaluates two context independent models and two...
Article
Full-text available
In this paper we define two intermediate models of textual entailment, which correspond to lexical and lexical-syntactic levels of representation. We manually annotated a sample from the RTE dataset according to each model, compared the outcome for the two models, and explored how well they approximate the notion of entailment. We show that the lex...
Conference Paper
Full-text available
The textual entailment problem is to determine if a given text entails a given hypothesis. This paper describes first a general generative probabilistic setting for textual entailment. We then focus on the sub-task of recognizing whether the lexical concepts present in the hypothesis are entailed from the text. This problem is recast as one of tex...
Conference Paper
Full-text available
This paper describes the Bar-Ilan system participating in the Recognising Textual Entailment Challenge. The paper proposes first a general probabilistic setting that formalizes the notion of textual entailment. We then describe a concrete alignment-based model for lexical entailment, which utilizes web co-occurrence statistics in a bag of words r...
Conference Paper
Full-text available
This paper describes the PASCAL Network of Excellence Recognising Textual Entailment (RTE) Challenge benchmark. The RTE task is defined as recognizing, given two text fragments, whether the meaning of one text can be inferred (entailed) from the other. This application-independent task is suggested as capturing major inferences about the var...
Article
Full-text available
The textual entailment task – determining if a given text entails a given hypothesis – provides an abstraction of applied semantic inference. This paper describes first a general generative probabilistic setting for textual entailment. We then focus on the sub-task of recognizing whether the lexical concepts present in the hypothesis are entailed f...
Article
This paper proposes a general probabilistic setting that formalizes a probabilistic notion of textual entailment. We further describe a particular preliminary model for lexical-level entailment, based on document cooccurrence probabilities, which follows the general setting. The model was evaluated on two application independent datasets, suggestin...
Article
Full-text available
This paper proposes a general probabilistic setting that formalizes the notion of textual entailment. In addition we describe a concrete model for lexical entailment based on web co-occurrence statistics in a bag of words representation.
Article
Full-text available
This paper studies the potential of identifying lexical paraphrases within a single corpus, focusing on the extraction of verb paraphrases. Most previous approaches detect individual paraphrase instances within a pair (or set) of comparable corpora, each of them containing roughly the same information, and rely on the substantial level of correspon...
Article
This paper studies the potential of identifying lexical paraphrases within a single corpus, focusing on the extraction of verb paraphrases. Most previous approaches detect individual paraphrase instances within a pair (or set) of "comparable" corpora, each of them containing roughly the same information, and rely on the substantial level of corre...
Article
Full-text available
We report on techniques for using discourse context to reduce ambiguity and improve translation accuracy in a multi-lingual (Spanish, German, and English) spoken language translation system. The techniques involve statistical models as well as knowledge-based models including discourse plan inference. This work is carried out in the context of th...
Article
All components of a typical IE system have been the object of some machine learning research, motivated by the need to improve time taken to transfer to new domains. In this paper we survey such methods and assess to what extent they can help create a complete IE system that can be easily adapted to new domains. We also lay out a general prescripti...
Article
Full-text available
The goal of this work is to use phonetic recognition to drive a synthetic image with speech. Phonetic units are identified by the phonetic recognition engine and mapped to mouth gestures, known as visemes, the visual counterpart of phonemes. The acoustic waveform and visemes are then sent to a synthetic image player, called FaceMe! where they are r...
Article
Full-text available
In this paper we investigate the possibility of translating continuous spoken conversations in a cross-talk environment. This is a task known to be difficult for human translators due to several factors. It is characterized by rapid and even overlapping turn-taking, a high degree of co-articulation, and fragmentary language. We describe experiments...
Conference Paper
We investigate the possibility of translating continuous spoken conversations in a cross talk environment. This is a task known to be difficult for human translators due to several factors. It is characterized by rapid and even overlapping turn taking, a high degree of coarticulation, and fragmentary language. We describe experiments using both pus...
Article
Full-text available
A most prominent phenomenon of natural languages is variability: stating the same meaning in various ways. Robust language processing applications, like Information Retrieval (IR), Question Answering (QA), Information Extraction (IE), text summarization and machine translation, must recognize the different forms in which their inputs and requested o...
