Proposing a tokenizer for Farsi words using regular expressions (the paper was originally written in Persian)
In the 5th International Conference on Electrical Engineering and Computer with emphasis on indigenous knowledge, 2017
This abstract is translated from the original abstract of the paper, written in Persian: This paper presents a word tokenizer that uses regular expressions to split a given text into words. The tokenizer is built on the concept of replaceability in regular expressions. The proposed method accurately recognizes and processes Farsi words, English words, symbols, and other special expressions. The algorithm identifies and isolates words while keeping track of how often each one occurs. The output of the system therefore includes the processed text, the total word count including repetitions (Words), the number of distinct words (Vocabulary), and a list of each word alongside its frequency of occurrence, sorted both alphabetically and by frequency to give a concise summary of the processed text. Tokenization is a crucial step in natural language processing applications, and the proposed tokenizer offers an effective and adaptable solution.
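For illustration only, the following minimal Python sketch shows how a regex-based tokenizer with the described outputs (Words, Vocabulary, and a frequency list) could be put together. The character ranges, token classes, and the summarize helper are assumptions made for this sketch; they do not reproduce the paper's actual patterns or its replaceability-based algorithm.

```python
import re
from collections import Counter

# Assumed token pattern (not from the paper): Farsi words are runs of
# characters in the Arabic Unicode block plus the zero-width non-joiner,
# English words are ASCII letter runs, numbers are digit runs, and any
# remaining non-space character is treated as a standalone symbol.
TOKEN_PATTERN = re.compile(
    r"[\u0600-\u06FF\u200C]+"   # Farsi word (includes ZWNJ)
    r"|[A-Za-z]+"               # English word
    r"|\d+"                     # number
    r"|\S"                      # any other single symbol
)

def tokenize(text):
    """Split the text into Farsi words, English words, numbers, and symbols."""
    return TOKEN_PATTERN.findall(text)

def summarize(text):
    """Return the token count (Words), distinct-token count (Vocabulary),
    and a frequency list sorted by descending frequency, then alphabetically."""
    tokens = tokenize(text)
    freq = Counter(tokens)
    freq_list = sorted(freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return {"Words": len(tokens), "Vocabulary": len(freq), "Frequencies": freq_list}

if __name__ == "__main__":
    sample = "زبان فارسی زیباست. Persian is beautiful!"
    print(summarize(sample))
```

Running the sketch on the sample sentence prints the total and distinct token counts together with the per-token frequencies, mirroring the kind of summary the paper describes as its output.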