MTech Projects
  • HOME
  • MTECH PROJECTS
    • COMPUTER SCIENCE
      • MTech Python Projects
        • Machine Learning Projects
        • Deep Learning Projects
        • Blockchain Projects
        • django Projects
      • MTech Java Projects
        • Cloud Computing Projects
        • Data Mining Projects
        • Mobile Computing Projects
        • Networking Projects
      • MTech NS2 Projects
        • Wireless Communication Projects
        • Vehicular Technology Projects
      • MTech Hadoop Projects
      • MTech Android Projects
    • ELECTRONICS
      • MTech DSP Projects
      • MTech DIP Projects
      • MTech VLSI Projects
      • MTech Communication Projects
    • ELECTRICAL
      • MTech Power Systems Projects
      • MTech Power Electronics Projects
      • MTech Control Systems Projects
    • OTHER
      • Chemical Projects
      • Mechanical Projects
      • All Other Projects
  • EMBEDDED KITS
    • MTech Embedded Kits
    • BTech Embedded Kits
  • PROJECTS+
  • PUBLISHING
    • Research Publishing
    • Authors Guidelines
    • Publishing Policy
  • CONTACT US

Contact Us

  • Street Number 4, Jawahar Nagar, RTC X Road, Hyderabad 500044
  • +91 9573777164
  • info@mtechprojects.com

Welcome to MTech Projects - Online Projects for MTech Students


Details
Category: MTech Machine Learning Projects
By MTech Projects
02.May

cuWide: Towards Efficient Flow-based Training for Sparse Wide Models on GPUs

PROJECT TITLE:

cuWide: Towards Efficient Flow-based Training for Sparse Wide Models on GPUs

ABSTRACT:

Wide models, such as generalized linear models and factorization-based models, are used extensively in predictive applications including recommendation, CTR prediction, and image recognition. Because these models are memory-bound, performance gains on the CPU are reaching their limit. The graphics processing unit (GPU), with its large number of computation units and high memory bandwidth, is an attractive platform for training machine learning models. However, because wide models are sparse and irregular, GPU training for them is far from optimal: existing GPU-based wide models are even slower than their CPU-based counterparts. The traditional training schema for wide models is not optimized for the GPU architecture; it generates a large number of random memory accesses and performs redundant reads and writes of intermediate values. In this article, we propose cuWide, an effective and efficient GPU-training framework for large-scale wide models. To make full use of the GPU memory hierarchy, cuWide uses a new flow-based training schema that exploits the spatial and temporal locality of wide models to drastically reduce communication with GPU global memory. We realize the flow-based schema with a bigraph computation model, together with three flexible programming interfaces. To further optimize GPU memory access for sparse data, we combine a 2D partition of each mini-batch (along the sample and feature dimensions) with the proposed graph abstraction.
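As a rough illustration of the 2D mini-batch partition mentioned above, the sketch below groups the nonzeros of a sparse mini-batch into tiles along the sample and feature dimensions. It is a CPU-side Python toy, not cuWide's implementation: the function name `partition_2d`, the tile sizes, and the sample data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def partition_2d(rows, sample_block, feature_block):
    """Group the nonzeros of a sparse mini-batch into 2D tiles.

    rows: list of samples, each a list of (feature_id, value) pairs.
    Returns a dict mapping (sample_tile, feature_tile) to the list of
    (sample_id, feature_id, value) triples that fall inside that tile.
    """
    tiles = defaultdict(list)
    for sample_id, row in enumerate(rows):
        for feature_id, value in row:
            # Tile coordinates: one index per dimension (hypothetical sizes).
            key = (sample_id // sample_block, feature_id // feature_block)
            tiles[key].append((sample_id, feature_id, value))
    return dict(tiles)


# A toy mini-batch: 4 samples over 8 features, stored sample by sample.
batch = [
    [(0, 1.0), (5, 2.0)],
    [(1, 1.0)],
    [(4, 3.0), (6, 1.0)],
    [(2, 1.0), (7, 0.5)],
]
tiles = partition_2d(batch, sample_block=2, feature_block=4)

for key in sorted(tiles):
    print(key, tiles[key])
```

Each tile touches only a bounded range of samples and a bounded range of model weights, which is the property that lets a tile's working set fit in a faster level of the memory hierarchy. How cuWide actually sizes and schedules its partitions is not stated in the abstract.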
Additionally, we implement several spatial-temporal caching mechanisms (importance-based model caching and cross-stage accumulation caching) to achieve a high-performance kernel. We also propose several GPU-oriented optimizations to implement cuWide efficiently: a feature-oriented data layout to improve data locality, a replication mechanism to reduce update conflicts in shared memory, and multi-stream scheduling to overlap data transfer with kernel computation. We demonstrate that cuWide can run more than 20 times faster than state-of-the-art GPU and multi-core CPU solutions in the best case.
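To make the idea of a feature-oriented data layout more concrete, here is a minimal Python sketch, assuming a sample-major sparse batch and a per-sample error term from the forward pass. It regroups the nonzeros by feature so that every contribution to one weight's gradient is contiguous, then accumulates locally and writes each gradient once. The names `to_feature_major` and `gradient_by_feature` are hypothetical, and the code only mimics the access pattern on the CPU; it is not cuWide's kernel.

```python
def to_feature_major(rows, num_features):
    """Regroup a sample-major sparse batch by feature (a CSC-like layout)."""
    columns = [[] for _ in range(num_features)]
    for sample_id, row in enumerate(rows):
        for feature_id, value in row:
            columns[feature_id].append((sample_id, value))
    return columns


def gradient_by_feature(columns, errors):
    """Accumulate each weight's gradient locally, then write it once."""
    grad = [0.0] * len(columns)
    for feature_id, entries in enumerate(columns):
        acc = 0.0  # stands in for a register or shared-memory accumulator
        for sample_id, value in entries:
            acc += errors[sample_id] * value
        grad[feature_id] = acc  # a single write per feature
    return grad


# Toy data: 3 samples over 3 features, plus an assumed per-sample error.
batch = [[(0, 1.0), (2, 2.0)], [(0, 3.0)], [(1, 1.0), (2, 1.0)]]
errors = [0.5, -1.0, 2.0]

columns = to_feature_major(batch, num_features=3)
grad = gradient_by_feature(columns, errors)
print(grad)
```

With a sample-major layout, updates to the same weight arrive scattered across samples, so parallel workers would contend for the same address. Grouping by feature keeps those updates together, which is one plausible reading of how such a layout improves locality and reduces conflicts.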


  • With Nominal and Ordinal Attributes, Learnable Weighting of Intra-Attribute Distances for Categorical Data Clustering
  • Generative Segmented Networks Production of Data in the Uniform Probability Space
  • For Inductive Semi-Supervised Learning Over Large-Scale Graphs, GAIN stands for Graph Attention & Interaction Network.
  • Dimensionality Reduction Using Adaptive Local Embedding Learning in a Semi-Supervised Environment
  • Approximation of Dynamic Double Classifiers for Cross-Domain Recognition
  • Cold-start Recommendation via Deep Pairwise Hashing
  • Personalized and context-aware multi-modal transportation recommendation using data from multiple urban sources
  • Scale Invariant Face Detection Using Group Sampling
  • Acceleration of Nonsmooth Convex Optimization with Constraints Individual Convergence
  • Methods and Techniques for Hypergraph Learning

SUPPORT
+91 9573777164
9:00am - 6:00pm IST
info@mtechprojects.com


Disclaimer : MTech Projects, is not associated or affiliated with IEEE, in any way. The mentioned IEEE Projects here are student projects inspired by ideas from IEEE publications, not projects conducted by or associated with IEEE.


Copyright © 2026 MTech Projects. All Rights Reserved.