Andong Tan

Germany
282 followers 280 connections

Education

  • The Hong Kong University of Science and Technology
  • -

    -

  • 1.0

    -

    Activities and Societies: Workshop on integration of electric devices into soft materials

  • -

    -

    Bachelor thesis with the highest score (1.0).
    An IEEE paper based on this thesis was published.

Publications

  • Unsupervised Domain Adaptive Object Detection with Class Label Shift Weighted Local Features

    ECCV 2022 workshops

    Due to the high transferability of features extracted from early layers (called local features), aligning marginal distributions of local features has achieved compelling results in unsupervised domain adaptive object detection. However, such marginal feature alignment suffers from the class label shift between source and target domains. Existing class label shift correction methods focus on image classification and cannot be directly applied to object detection due to objects’ co-occurrence. Meanwhile, one property of local features is that they have small receptive fields and can be easily mapped back to specific areas of input images. Therefore, to handle object co-occurrence scenarios, we propose to leverage this property to decompose the source feature maps and compute the source domain class distribution at the pixel level. The decomposition is based on each feature pixel’s receptive field overlap with ground-truth bounding boxes. In the target domain, where no labels are available, we estimate this distribution using predicted bounding boxes and thus obtain the estimated class label shift between domains. This estimated shift is further used to re-weight source local features during the feature alignment. To the best of our knowledge, this is the first work to explicitly correct class label shift in unsupervised domain adaptive object detection. Experimental results demonstrate that this approach can systematically improve several recent domain adaptive object detectors, such as SW and HTCN, on benchmark datasets with different degrees of class label shift.

  • CycleHand: Increasing 3D Pose Estimation Ability on In-the-wild Monocular Image through Cyclic Flow

    Proceedings of the 30th ACM International Conference on Multimedia

    Current methods for 3D hand pose estimation fail to generalize well to new in-the-wild scenarios due to varying camera viewpoints, self-occlusions, and complex environments. To address this problem, we propose CycleHand to improve the generalization ability of the model in a self-supervised manner. Our motivation is based on an observation: if one globally rotates the whole hand and then rotates it back, the estimated 3D poses of the fingers should remain consistent before and after the rotation, because the wrist-relative hand poses stay unchanged during global 3D rotation. Hence, we propose arbitrary-rotation self-supervised consistency learning to improve the model’s robustness to varying viewpoints. Another innovation of CycleHand is a high-fidelity texture map used to render the photorealistic rotated hand with different lighting conditions, backgrounds, and skin tones, which further enhances the effectiveness of our self-supervised task. To reduce the potential negative effects of the domain shift of synthetic images, we use the idea of contrastive learning to learn a synthetic-real consistent feature extractor that extracts domain-irrelevant hand representations. Experiments show that CycleHand can largely improve hand pose estimation performance on both canonical datasets and real-world applications.

  • A Real World Information-Centric Connected Vehicle Testbed Supporting ETSI ITS-G5

    2018 European Conference on Networks and Communications (EuCNC): Operational & Experimental Insights (OPE)

    Inter-Vehicle Communication will play an important role in upcoming Intelligent Transportation Systems (ITS). Vehicles are equipped with communication units, able to offer and share information with other cars, infrastructure components or cloud servers. In recent years, researchers in academia and industry have worked towards ITS standards to tackle the challenges set by connected vehicle environments such as high latency or communication failures. However, the host-centric communication model of today’s networks complicates the exchange of information, especially in vehicle-to-infrastructure (V2I) and vehicle-to-cloud (V2C) communication. The Information-Centric Networking (ICN) paradigm is a promising candidate to solve the challenges set by connected vehicles. Addressing data by name instead of by location, and the resulting capabilities such as in-network caching, make it a good fit for scenarios characterized by a high degree of mobility. In this paper, we propose an architectural concept in which ICN and the inter-vehicle communication system ETSI ITS-G5 (based on IEEE 802.11p) coexist and complement each other. Based on the OpenC2X open source platform, we introduce a prototype implementation and verify it within a real-world testbed.

    Other authors
    • Dennis Grewe
    • Marco Wagner
    • Sebastian Schildt
    • Hannes Frey
  • Explicitly Modeled Attention Maps for Image Classification

    AAAI Conference on Artificial Intelligence 2021

    Self-attention networks have shown remarkable progress in computer vision tasks such as image classification. The main benefit of the self-attention mechanism is the ability to capture long-range feature interactions in attention-maps. However, the computation of attention-maps requires a learnable key, query, and positional encoding, whose usage is often unintuitive and computationally expensive. To mitigate this problem, we propose a novel self-attention module with explicitly modeled attention-maps using only a single learnable parameter for low computational overhead. The design of explicitly modeled attention-maps using a geometric prior is based on the observation that the spatial context for a given pixel within an image is mostly dominated by its neighbors, while more distant pixels make a minor contribution. Concretely, the attention-maps are parametrized via simple functions (e.g., Gaussian kernel) with a learnable radius, which is modeled independently of the input content. Our evaluation shows that our method achieves an accuracy improvement of up to 2.2% over the ResNet-baselines in ImageNet ILSVRC and outperforms other self-attention methods such as AA-ResNet152 in accuracy by 0.9% with 6.4% fewer parameters and 6.7% fewer GFLOPs. This result empirically indicates the value of incorporating a geometric prior into the self-attention mechanism when applied to image classification.
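
The idea of an attention map that depends only on pixel geometry, not on the input content, can be illustrated with a short sketch. This is not the implementation from the paper: the function names `gaussian_attention_map` and `apply_attention`, the row normalization, and the use of plain NumPy with a fixed (rather than learned) `radius` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_attention_map(height, width, radius):
    """Build an (N, N) attention map over N = height * width pixel positions.

    Hypothetical helper: weights follow a Gaussian of the squared pixel
    distance and depend only on geometry, never on feature values.
    """
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    coords = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(np.float64)
    # Pairwise squared Euclidean distances between all pixel positions.
    diff = coords[:, None, :] - coords[None, :, :]
    sq_dist = (diff ** 2).sum(axis=-1)
    weights = np.exp(-sq_dist / (2.0 * radius ** 2))
    # Assumption: normalize each row so the weights for one query pixel sum to 1.
    return weights / weights.sum(axis=1, keepdims=True)


def apply_attention(features, radius):
    """Mix an (H, W, C) feature map with the geometry-only attention map."""
    height, width, channels = features.shape
    attn = gaussian_attention_map(height, width, radius)
    flat = features.reshape(height * width, channels)
    return (attn @ flat).reshape(height, width, channels)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(4, 5, 3))
    out = apply_attention(feats, radius=1.5)
    print(out.shape)
```

In a trainable module, `radius` would be the single learnable parameter; a small radius keeps each pixel close to its own features, while a larger one blends in more distant context.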


Languages

  • German

    Full professional proficiency

  • Chinese

    Native or bilingual proficiency

  • English

    Full professional proficiency
